CALCULUS OF SEVERAL VARIABLES


SERGE LANG 

Yale University 




ADDISON-WESLEY PUBLISHING COMPANY 
Reading, Massachusetts • Menlo Park, California • London • Don Mills, Ontario 


This book is in the 

ADDISON-WESLEY SERIES IN MATHEMATICS 

Lynn H. Loomis 
Consulting Editor 


Cover photograph of a thunderstorm by Ernst Haas. Photograph appears in 
The Creation published by The Viking Press, Inc., 1971. 


Copyright © 1973, 1968 by Addison-Wesley Publishing Company, Inc. Philippines 
copyright 1973 by Addison-Wesley Publishing Company, Inc. 

All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior written permission of the publisher.
Printed in the United States of America. Published simultaneously in Canada. Library
of Congress Catalog Card No. 74-183671




Foreword 


The present course on calculus of several variables is meant as a text, 
either for one semester following the First Course in Calculus, or for a 
longer period if the calculus sequence is so structured. 

In a one-semester course, I suggest covering most of the first part, 
omitting Chapter II, §3 and omitting some material from the chapter on 
Taylor’s formula in several variables, to suit the taste of the instructor and 
the class. One can then jump directly to the chapter on double and triple 
integrals, which could in fact be treated immediately after Chapter I. If 
time allows, one can also cover the first section in the chapter on Green’s
theorem, which gives a neat application of the techniques of double
integrals and curve integrals. Joining them in this fashion will make the
student learn both techniques better for having used them in a significant
context.

The first part has considerable unity of style. Essentially all the results 
are immediate corollaries of the chain rule. The main idea is that given
a function of several variables, if we want to look at its values at two 
points P and Q, we join these points by a curve (often a straight line), 
and then look at the values of the function on that curve. By this device, 
we are able to reduce a large number of problems in several variables to 
problems and techniques in one variable. For instance, the directional 
derivative, the law of conservation of energy, and Taylor’s formula, are 
handled in this manner. 

I have included only that part of linear algebra which is immediately 
useful for the applications to calculus. My Introduction to Linear Algebra 
provides an appropriate text when a whole semester is devoted to the 
subject. Many courses are still structured to give primary emphasis to
the analytic aspects, and only a few notions involving matrices and linear
maps are needed to cover, say, the chain rule for mappings of one space
into another, and to emphasize the importance of linear approximations.

The last chapter on surface integrals and Stokes’ theorem could essentially
be covered after Green’s theorem and multiple integrals. The chapter
on the change of variables formula in multiple integration is the most
expendable one, and can be omitted altogether without affecting the
understanding of the rest of the book. Each instructor will adapt the material
to the needs of any given class.


New Haven, Connecticut 
November 1972 


Serge Lang


PART ONE

MAPPINGS FROM NUMBERS 
TO VECTORS AND 
VECTORS TO NUMBERS 



In dealing with higher dimensional space, we can often reduce certain 
problems to 1-dimensional ones by using the following idea. We can join 
two points in space by a line segment. If we have a function defined in 
some region in space containing the points, and we want to analyze the 
behavior of the function at these points, then we can look at the induced 
function on the line segment. This yields a function of one variable. 

Dealing with a segment between two points amounts to dealing with a 
mapping from numbers to higher dimensional space, parametrizing the 
segment. On the other hand, a function defined on a region in space 
takes on values in the real numbers. These two cases are important in 
themselves, and are also used later in the general situation where we 
consider mappings from one space into another. 



CHAPTER I 


Vectors 


The concept of a vector is basic for the study of functions of several 
variables. It provides geometric motivation for everything that follows. 
Hence the properties of vectors, both algebraic and geometric, will be 
discussed in full. 

One significant feature of all the statements and proofs of this part is 
that they are neither easier nor harder to prove in 3- or n-space than they 
are in 2-space. 


§1. Definition of points in n-space 

We know that a number can be used to represent a point on a line, 
once a unit length is selected. 

A pair of numbers (i.e. a couple of numbers) (x, y) can be used to 
represent a point in the plane. 

These can be pictured as follows:

Figure 1. (a) Point on a line. (b) Point in a plane.


We now observe that a triple of numbers (x, y, z) can be used to represent
a point in space, that is 3-dimensional space, or 3-space. We simply
introduce one more axis.

The following picture illustrates this.

Figure 2. A point in 3-space (the y-axis and z-axis are labeled).


Instead of using x, y, z we could also use $(x_1, x_2, x_3)$. The line could
be called 1-space, and the plane could be called 2-space.

Thus we can say that a single number represents a point in 1-space. A 
couple represents a point in 2-space. A triple represents a point in 3-space. 

Although we cannot draw a picture to go further, there is nothing to 
prevent us from considering a quadruple of numbers 

$(x_1, x_2, x_3, x_4)$

and decreeing that this is a point in 4-space. A quintuple would be a 
point in 5-space, then would come a sextuple, septuple, octuple, .... 

We let ourselves be carried away and define a point in n-space to be an 
n-tuple of numbers 


$(x_1, x_2, \ldots, x_n),$

if n is a positive integer. We shall denote such an n-tuple by a capital
letter X, and try to keep small letters for numbers and capital letters for
points. We call the numbers $x_1, \ldots, x_n$ the coordinates of the point X.
For example, in 3-space, 2 is the first coordinate of the point (2, 3, -4),
and -4 is its third coordinate.

Most of our examples will take place when n = 2 or n = 3. Thus the 
reader may visualize either of these two cases throughout the book.
However, two comments must be made: First, practically no formula or
theorem is simpler by making such assumptions on n. Second, the case 
n = 4 does occur in physics, and the case n = n occurs often enough 
in practice or theory to warrant its treatment here. Furthermore, part 
of our purpose is in fact to show that the general case is always similar 
to the case when n = 2 or n = 3.

Examples. One classical example of 3-space is of course the space we
live in. After we have selected an origin and a coordinate system, we can 
describe the position of a point (body, particle, etc.) by 3 coordinates. 
Furthermore, as was known long ago, it is convenient to extend this space 
to a 4-dimensional space, with the fourth coordinate as time, the time 
origin being selected, say, as the birth of Christ—although this is purely 
arbitrary (it might be more convenient to select the birth of the solar 
system, or the birth of the earth as the origin, if we could determine these 
accurately). Then a point with negative time coordinate is a BC point, 
and a point with positive time coordinate is an AD point. 

Don’t get the idea that “time is the fourth dimension”, however. The 
above 4-dimensional space is only one possible example. In economics, 
for instance, one uses a very different space, taking for coordinates, say, 
the number of dollars expended in an industry. For instance, we could 
deal with a 7-dimensional space with coordinates corresponding to the 
following industries: 

1. Steel 2. Auto 3. Farm products 4. Fish 

5. Chemicals 6. Clothing 7. Transportation 

We agree that a megabuck per year is the unit of measurement. Then a 
point 

(1,000, 800, 550, 300, 700, 200, 900) 

in this 7-space would mean that the steel industry spent one billion dollars 
in the given year, and that the chemical industry spent 700 million 
dollars in that year. 

We shall now define how to add points. If A, B are two points, say

$A = (a_1, \ldots, a_n), \quad B = (b_1, \ldots, b_n),$

then we define A + B to be the point whose coordinates are

$(a_1 + b_1, \ldots, a_n + b_n).$

For example, in the plane, if A = (1, 2) and B = (-3, 5), then

A + B = (-2, 7).

In 3-space, if $A = (-1, \pi, 3)$ and $B = (\sqrt{2}, 7, -2)$, then

$A + B = (\sqrt{2} - 1, \pi + 7, 1).$

Furthermore, if c is any number, we define cA to be the point whose
coordinates are

$(ca_1, \ldots, ca_n).$

If A = (2, -1, 5) and c = 7, then cA = (14, -7, 35).

We observe that the following rules are satisfied:

(1) (A + B) + C = A + (B + C).

(2) A + B = B + A.

(3) c(A + B) = cA + cB.

(4) If $c_1, c_2$ are numbers, then

$(c_1 + c_2)A = c_1 A + c_2 A$ and $(c_1 c_2)A = c_1(c_2 A).$

(5) If we let O = (0, . . . , 0) be the point all of whose coordinates are 0,
then O + A = A + O = A for all A.

(6) $1 \cdot A = A$, and if we denote by -A the n-tuple (-1)A, then

A + (-A) = O.

[Instead of writing A + (-B), we shall frequently write A - B.]

All these properties are very simple to prove, and we suggest that you
verify them on some examples. We shall give in detail the proof of
property (3). Let $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$. Then

$A + B = (a_1 + b_1, \ldots, a_n + b_n)$

and

$c(A + B) = (c(a_1 + b_1), \ldots, c(a_n + b_n))$

$= (ca_1 + cb_1, \ldots, ca_n + cb_n)$

$= cA + cB,$


this last step being true by definition of addition of n-tuples.

The other proofs are left as exercises. 
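
The definitions of A + B and cA can be tried out in a few lines of Python. This is only a sketch: the function names `add` and `scale` are hypothetical, points are modeled as Python tuples, and the sample point B used to test property (3) is an arbitrary choice that does not come from the text. A check on one sample illustrates a rule; it does not prove it.

```python
from fractions import Fraction

def add(A, B):
    """Coordinatewise sum of two n-tuples of equal length."""
    if len(A) != len(B):
        raise ValueError("both points must lie in the same n-space")
    return tuple(a + b for a, b in zip(A, B))

def scale(c, A):
    """Multiply every coordinate of the n-tuple A by the number c."""
    return tuple(c * a for a in A)

# Values taken from the text.
print(add((1, 2), (-3, 5)))     # (-2, 7)
print(scale(7, (2, -1, 5)))     # (14, -7, 35)

# Property (3) on one sample; B and c are arbitrary illustrative choices.
A = (2, -1, 5)
B = (Fraction(1, 2), 3, -4)
c = 7
print(scale(c, add(A, B)) == add(scale(c, A), scale(c, B)))   # True
```

Exact fractions are used for the sample so that the comparison in the last line is not affected by rounding.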

Note. Do not confuse the number 0 and the n-tuple (0, . . . , 0). We 
usually denote this n-tuple by O, and also call it zero, because no difficulty 
can occur in practice. 

We shall now interpret addition and multiplication by numbers geometrically
in the plane (you can visualize simultaneously what happens
in 3-space).

Take an example. Let A = (2, 3) and B = (—1,1). Then 

A + B = (1,4). 

The figure looks like a parallelogram (Fig. 3).

Figure 3

Take another example. Let A = (3, 1) and B = (1, 2). Then

A + B = (4, 3). 

We see again that the geometric representation of our addition looks like 
a parallelogram (Fig. 4). 



The reason why the figure looks like a parallelogram can be given in 
terms of plane geometry as follows. We obtain B = (1, 2) by starting 
from the origin O = (0, 0), and moving 1 unit to the right and 2 up. 
To get A + B, we start from A, and again move 1 unit to the right and 
2 up. Thus the line segments between O and B, and between A and 
A + B are the hypotenuses of right triangles whose corresponding legs 
are of the same length, and parallel. The above segments are therefore 
parallel and of the same length, as illustrated on the following figure. 



What is the representation of multiplication by a number? Let
A = (1, 2) and c = 3. Then cA = (3, 6) as in Fig. 5(a).

Multiplication by 3 amounts to stretching A by 3. Similarly, (1/2)A
amounts to stretching A by 1/2, i.e. shrinking A to half its size. In general,
if t is a number, t > 0, we interpret tA as a point in the same direction
as A from the origin, but t times the distance.

Multiplication by a negative number reverses the direction! Thus -3A
would be represented as in Fig. 5(b).

Figure 5


Exercises 


Find A + B, A - B, 3A, -2B in each of the following cases. Draw the points
of Exercises 1 and 2 on a sheet of graph paper.

1. A = (2, -1), B = (-1, 1)    2. A = (-1, 3), B = (0, 4)

3. A = (2, -1, 5), B = (-1, 1, 1)    4. A = (-1, -2, 3), B = (-1, 3, -4)

5. $A = (\pi, 3, -1)$, $B = (2\pi, -3, 7)$    6. $A = (15, -2, 4)$, $B = (\pi, 3, -1)$

7. Let A = (1, 2) and B = (3, 1). Draw A + B, A + 2B, A + 3B, A - B,
A - 2B, A - 3B on a sheet of graph paper.

8. Let A, B be as in Exercise 1. Draw the points A + 2B, A + 3B, A - 2B,
A - 3B, A + (1/2)B on a sheet of graph paper.

9. Let A and B be as drawn in the following figures. Draw the point A - B.


§2. Located vectors


We define a located vector to be an ordered pair of points which we 
write AB. (This is not a product.) We visualize this as an arrow between 
A and B. We call A the beginning point and B the end point of the located 
vector (Fig. 6). 



Figure 6 


How are the coordinates of B obtained from those of A? We observe
that in the plane,

$b_1 = a_1 + (b_1 - a_1).$

Similarly,

$b_2 = a_2 + (b_2 - a_2).$

This means that

B = A + (B - A).


Let AB and CD be two located vectors. We shall say that they are
equivalent if B - A = D - C. Every located vector AB is equivalent
to one whose beginning point is the origin, because AB is equivalent to
O(B - A). Clearly this is the only located vector whose beginning point
is the origin and which is equivalent to AB. If you visualize the
parallelogram law in the plane, then it is clear that equivalence of two located
vectors can be interpreted geometrically by saying that the lengths of the
line segments determined by the pair of points are equal, and that the
“directions” in which they point are the same.

In the next figures, we have drawn the located vectors O(B - A), AB,
and O(A - B), BA.



Figure 7 


Figure 8

Given a located vector OC whose beginning point is the origin, we shall
say that it is located at the origin. Given any located vector AB, we shall 
say that it is located at A. 

A located vector at the origin is entirely determined by its end point.
In view of this, we shall call an n-tuple either a point or a vector,
depending on the interpretation which we have in mind.

Two located vectors AB and PQ are said to be parallel if there is a
number $c \neq 0$ such that B - A = c(Q - P). They are said to have the
same direction if there is a number c > 0 such that B — A = c(Q — P), 
and to have opposite direction if there is a number c < 0 such that 
B — A = c(Q — P). In the next pictures, we illustrate parallel located 
vectors. 



(a) Same direction 



Figure 9 

In a similar manner, any definition made concerning n-tuples can be
carried over to located vectors. For instance, in the next section, we shall
define what it means for n-tuples to be perpendicular. Then we can say
that two located vectors AB and PQ are perpendicular if B - A is
perpendicular to Q - P. In the next figure, we have drawn a picture of such
vectors in the plane.



Figure 10

Example 1. Let P = (1, -1, 3) and Q = (2, 4, 1). Then PQ is equivalent
to OC, where C = Q - P = (1, 5, -2). If A = (4, -2, 5) and
B = (5, 3, 3), then PQ is equivalent to AB because

Q - P = B - A = (1, 5, -2).

Example 2. Let P = (3,7) and Q = (—4,2). Let A = (5, 1) and 
B = (-16, -14). Then 

Q - P = (-7, -5) and B - A = (-21, -15). 

Hence PQ is parallel to AB, because B — A = 3 (Q — P). Since 3 > 0, 
we even see that PQ and AB have the same direction. 
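
The two examples above can be re-derived mechanically. The following sketch assumes integer coordinates (so that exact fractions can be used for the ratio c); the helper names `diff`, `equivalent`, and `parallel_factor` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def diff(P, Q):
    """Return Q - P, the n-tuple attached to the located vector PQ."""
    return tuple(q - p for p, q in zip(P, Q))

def equivalent(P, Q, A, B):
    """PQ and AB are equivalent when Q - P = B - A."""
    return diff(P, Q) == diff(A, B)

def parallel_factor(P, Q, A, B):
    """Return the number c != 0 with B - A = c(Q - P), or None if none exists.

    Assumes integer coordinates.
    """
    u, v = diff(P, Q), diff(A, B)
    c = None
    for ui, vi in zip(u, v):
        if ui == 0:
            if vi != 0:
                return None          # no multiple of 0 gives a non-zero number
            continue
        ratio = Fraction(vi, ui)
        if c is None:
            c = ratio
        elif ratio != c:
            return None              # coordinates disagree on the factor
    return c if c else None          # rejects c = 0 and the case Q = P

# Example 1: equivalent located vectors.
print(equivalent((1, -1, 3), (2, 4, 1), (4, -2, 5), (5, 3, 3)))   # True

# Example 2: parallel, with c = 3 > 0, hence the same direction.
print(parallel_factor((3, 7), (-4, 2), (5, 1), (-16, -14)))       # 3
```

A positive return value means the two located vectors have the same direction, a negative one that they have opposite direction.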

Exercises 


In each case, determine which located vectors PQ and AB are equivalent.

1. P = (1, -1), Q = (4, 3), A = (-1, 5), B = (5, 2).

2. P = (1, 4), Q = (-3, 5), A = (5, 7), B = (1, 8).

3. P = (1, -1, 5), Q = (-2, 3, -4), A = (3, 1, 1), B = (0, 5, 10).

4. P = (2, 3, -4), Q = (-1, 3, 5), A = (-2, 3, -1), B = (-5, 3, 8).

In each case, determine which located vectors PQ and AB are parallel.

5. P = (1, -1), Q = (4, 3), A = (-1, 5), B = (7, 1).

6. P = (1, 4), Q = (-3, 5), A = (5, 7), B = (9, 6).

7. P = (1, -1, 5), Q = (-2, 3, -4), A = (3, 1, 1), B = (-3, 9, -17).

8. P = (2, 3, -4), Q = (-1, 3, 5), A = (-2, 3, -1), B = (-11, 3, -28).

9. Draw the located vectors of Exercises 1, 2, 5, and 6 on a sheet of paper to
illustrate these exercises. Also draw the located vectors QP and BA. Draw
the points Q - P, B - A, P - Q, and A - B.

§3. Scalar product

It is understood that throughout a discussion we select vectors always
in the same n-dimensional space.

Let $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$ be two vectors. We
define their scalar or dot product A • B to be

$a_1 b_1 + \cdots + a_n b_n.$

This product is a number. For instance, if

A = (1, 3, -2) and B = (-1, 4, -3)

then

A • B = -1 + 12 + 6 = 17.

For the moment, we do not give a geometric interpretation to this scalar
product. We shall do this later. We derive first some important
properties. The basic ones are:

SP 1. We have A • B = B • A.

SP 2. If A, B, C are three vectors then

A • (B + C) = A • B + A • C = (B + C) • A.

SP 3. If x is a number, then

(xA) • B = x(A • B) and A • (xB) = x(A • B).

SP 4. If A = O is the zero vector, then A • A = 0, and otherwise
A • A > 0.

We shall now prove these properties. 

Concerning the first, we have

$a_1 b_1 + \cdots + a_n b_n = b_1 a_1 + \cdots + b_n a_n,$

because for any two numbers a, b, we have ab = ba. This proves the
first property.

For SP 2, let $C = (c_1, \ldots, c_n)$. Then

$B + C = (b_1 + c_1, \ldots, b_n + c_n)$

and

$A \cdot (B + C) = a_1(b_1 + c_1) + \cdots + a_n(b_n + c_n)$

$= a_1 b_1 + a_1 c_1 + \cdots + a_n b_n + a_n c_n.$

Reordering the terms yields

$a_1 b_1 + \cdots + a_n b_n + a_1 c_1 + \cdots + a_n c_n,$

which is none other than A • B + A • C. This proves what we wanted.
We leave property SP 3 as an exercise.

Finally, for SP 4, we observe that if one coordinate $a_i$ of A is not equal
to 0, then there is a term $a_i^2 \neq 0$ and $a_i^2 > 0$ in the scalar product

$A \cdot A = a_1^2 + \cdots + a_n^2.$

Since every term is $\geq 0$, it follows that the sum is > 0, as was to be
shown.

In much of the work which we shall do concerning vectors, we shall use
only the ordinary properties of addition, multiplication by numbers, and
the four properties of the scalar product. We shall give a formal discussion
of these later. For the moment, observe that there are other objects with
which you are familiar and which can be added, subtracted, and multiplied
by numbers, for instance the continuous functions on an interval [a, b]
(cf. Exercise 6).

Instead of writing A • A for the scalar product of a vector with itself,
it will be convenient to write also $A^2$. (This is the only instance when we
allow ourselves such a notation. Thus $A^3$ has no meaning.) As an exercise,
verify the following identities:

$(A + B)^2 = A^2 + 2A \cdot B + B^2,$

$(A - B)^2 = A^2 - 2A \cdot B + B^2.$

A dot product A • B may very well be equal to 0 without either A or B 
being the zero vector. For instance, let A = (1, 2, 3) and B = (2, 1, -4/3).
Then A • B = 0. 
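
The scalar product is short to compute by machine. The sketch below uses a hypothetical function name `dot` and represents vectors as tuples; it recomputes the value 17 obtained earlier in this section and confirms that (1, 2, 3) and (2, 1, -4/3) have dot product 0 although neither is the zero vector. The third coordinate -4/3 is entered as an exact fraction to avoid rounding.

```python
from fractions import Fraction

def dot(A, B):
    """Scalar product a_1*b_1 + ... + a_n*b_n of two n-tuples."""
    if len(A) != len(B):
        raise ValueError("both vectors must lie in the same n-space")
    return sum(a * b for a, b in zip(A, B))

print(dot((1, 3, -2), (-1, 4, -3)))               # 17
print(dot((1, 2, 3), (2, 1, Fraction(-4, 3))))    # 0

# SP 1 and SP 4 on a sample vector.
A, B = (1, 3, -2), (-1, 4, -3)
print(dot(A, B) == dot(B, A))                     # True
print(dot(A, A) > 0)                              # True
```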

We define two vectors A, B to be perpendicular (or as we shall also say, 
orthogonal) if A • B = 0. For the moment, it is not clear that in the 
plane, this definition coincides with our intuitive geometric notion of 
perpendicularity. We shall convince you that it does in the next section. 
Here we merely note an example. Say in $R^3$, let

$E_1 = (1, 0, 0), \quad E_2 = (0, 1, 0), \quad E_3 = (0, 0, 1)$

be the three unit vectors, as shown on the diagram (Fig. 11).

Figure 11

Then we see that $E_1 \cdot E_2 = 0$, and similarly $E_i \cdot E_j = 0$ if $i \neq j$.
And these vectors look perpendicular. If $A = (a_1, a_2, a_3)$, then we observe
that the i-th component of A, namely

$a_i = A \cdot E_i,$

is the dot product of A with the i-th unit vector. We see that A is
perpendicular to $E_i$ (according to our definition of perpendicularity with the
dot product) if and only if its i-th component is equal to 0.

Exercises 

1. Find A • A for each of the following n-tuples.

(a) A = (2, -1), B = (-1, 1)    (b) A = (-1, 3), B = (0, 4)

(c) A = (2, -1, 5), B = (-1, 1, 1)    (d) A = (-1, -2, 3), B = (-1, 3, -4)

(e) $A = (\pi, 3, -1)$, $B = (2\pi, -3, 7)$    (f) $A = (15, -2, 4)$, $B = (\pi, 3, -1)$

2. Find A • B for each of the above n-tuples.

3. Using only the four properties of the scalar product, verify in detail the
identities given in the text for $(A + B)^2$ and $(A - B)^2$.

4. Which of the following pairs of vectors are perpendicular?

(a) (1, -1, 1) and (2, 1, 5)    (b) (1, -1, 1) and (2, 3, 1)

(c) (-5, 2, 7) and (3, -1, 2)    (d) $(\pi, 2, 1)$ and $(2, -\pi, 0)$

5. Let A be a vector perpendicular to every vector X. Show that A = O.

Scalar product for functions.

6. Consider continuous functions on the interval [-1, 1]. Define the scalar
product of two such functions f, g to be

$\int_{-1}^{1} f(x)g(x)\,dx.$

We denote this integral also by $\langle f, g \rangle$. Verify that the four rules for a scalar
product are satisfied, in other words, show that:

SP 1. $\langle f, g \rangle = \langle g, f \rangle$.

SP 2. $\langle f, g + h \rangle = \langle f, g \rangle + \langle f, h \rangle$.

SP 3. $\langle cf, g \rangle = c\langle f, g \rangle$.

SP 4. If f = 0 then $\langle f, f \rangle = 0$ and if $f \neq 0$ then $\langle f, f \rangle > 0$.

7. If f(x) = x and $g(x) = x^2$, what are $\langle f, f \rangle$, $\langle g, g \rangle$, and $\langle f, g \rangle$?

8. Consider continuous functions on the interval $[-\pi, \pi]$. Define a scalar
product similar to the above for this interval. Show that the functions
sin nx and cos mx are orthogonal for this scalar product (m, n being integers).

§4. The norm of a vector 

We define the norm, or length, of a vector A, and denote by $\|A\|$, the
number

$\|A\| = \sqrt{A \cdot A}.$

Since $A \cdot A \geq 0$, we can take the square root.

In terms of coordinates, we see that

$\|A\| = \sqrt{a_1^2 + \cdots + a_n^2},$

and therefore that when n = 2 or n = 3, this coincides with our intuitive
notion (derived from the Pythagoras theorem) of length. Indeed, when
n = 2 and say A = (a, b), then the norm of A is

$\|A\| = \sqrt{a^2 + b^2},$

as in the following picture.



For example, if A = (1, 2), then

$\|A\| = \sqrt{1 + 4} = \sqrt{5}.$

If B = (-1, 2, 3), then

$\|B\| = \sqrt{1 + 4 + 9} = \sqrt{14}.$

If n = 3, then the picture looks like Fig. 13, with A = (x, y, z). 



Figure 13

If we first look at the two components (x, y), then the length of the
segment between (0, 0) and (x, y) is equal to $w = \sqrt{x^2 + y^2}$, as indicated.
Then again the length of A by the Pythagoras theorem would be

$\sqrt{w^2 + z^2} = \sqrt{x^2 + y^2 + z^2}.$

Thus when n = 3, our definition of length is compatible with the geometry
of the Pythagoras theorem.

If $A = (a_1, \ldots, a_n)$ and $A \neq O$, then $\|A\| \neq 0$ because some
coordinate $a_i \neq 0$, so that $a_i^2 > 0$, and hence $a_1^2 + \cdots + a_n^2 > 0$, so $\|A\| \neq 0$.
Observe that for any vector A we have

$\|A\| = \|-A\|.$

This is due to the fact that

$(-a_1)^2 + \cdots + (-a_n)^2 = a_1^2 + \cdots + a_n^2,$

because $(-1)^2 = 1$. Of course, this is as it should be from the picture:



Figure 14 


From the geometry of the situation, it is also reasonable to expect that
if c > 0, then $\|cA\| = c\|A\|$, i.e. if we stretch a vector A by multiplying
by a positive number c, then the length stretches also by that amount.
We verify this formally using our definition of the length.

Theorem 1. Let x be a number. Then

$\|xA\| = |x| \, \|A\|$

(absolute value of x times the length of A).

Proof. By definition, we have

$\|xA\|^2 = (xA) \cdot (xA),$

which is equal to

$x^2 (A \cdot A)$

by the properties of the scalar product. Taking the square root, and using
$\sqrt{x^2} = |x|$, now yields what we want.
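
As a numerical illustration of the norm and of Theorem 1, here is a small sketch. The names `dot` and `norm` are hypothetical, and the comparison uses `math.isclose` because square roots are computed in floating point; the sample vector and the values of x are arbitrary choices.

```python
import math

def dot(A, B):
    """Scalar product of two n-tuples of equal length."""
    return sum(a * b for a, b in zip(A, B))

def norm(A):
    """Length of A, defined as the square root of A . A."""
    return math.sqrt(dot(A, A))

A = (1, 2, -3)
for x in (3, -3, 0.5, 0):
    xA = tuple(x * a for a in A)
    # Theorem 1: the norm of xA equals |x| times the norm of A.
    print(x, math.isclose(norm(xA), abs(x) * norm(A)))
```

Each line of output should end in `True`, including the lines for x = -3 and x = 0, where the absolute value in the theorem matters.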


We shall say that a vector E is a unit vector if $\|E\| = 1$. Given any
vector A, let $a = \|A\|$. If $a \neq 0$ then

$\frac{1}{a} A$

is a unit vector, because

$\left\| \frac{1}{a} A \right\| = \frac{1}{a} \, a = 1.$

We shall say that two vectors A, B (neither of which is O) have the
same direction if there is a number c > 0 such that cA = B. In view of
this definition, we see that the vector

$\frac{1}{\|A\|} A$

is a unit vector in the direction of A (provided $A \neq O$).

Figure 15

If E is the unit vector in the direction of A, and $\|A\| = a$, then

A = aE.

Example 1. Let A = (1, 2, -3). Then $\|A\| = \sqrt{14}$. Hence the unit
vector in the direction of A is the vector

$E = \left( \frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}, \frac{-3}{\sqrt{14}} \right).$

We mention in passing that two vectors A , B (neither of which is O ) 
have opposite directions if there is a number c < 0 such that cA = B. 

Let A, B be two n-tuples. We define the distance between A and B 
to be 

$\|A - B\| = \sqrt{(A - B) \cdot (A - B)}.$

This definition coincides with our geometric intuition when A, B are 
points in the plane (Fig. 16). It is the same thing as the length of the 
located vector AB or the located vector BA. 







Figure 16 


Example 2. Let A = (-1, 2) and B = (3, 4). Then the length of the
located vector AB is $\|B - A\|$. But B - A = (4, 2). Thus

$\|B - A\| = \sqrt{16 + 4} = \sqrt{20}.$

In the picture, we see that the horizontal side has length 4 and the vertical 
side has length 2. Thus our definitions reflect our geometric intuition 
derived from Pythagoras. 
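
The distance between two points is equally direct to compute. In this sketch the function name `distance` is hypothetical; the points are those of Example 2.

```python
import math

def distance(A, B):
    """Distance between the n-tuples A and B, i.e. the norm of A - B."""
    if len(A) != len(B):
        raise ValueError("both points must lie in the same n-space")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(A, B)))

A, B = (-1, 2), (3, 4)
print(distance(A, B))                                # square root of 20, about 4.472
print(math.isclose(distance(A, B), math.sqrt(20)))   # True
print(distance(A, B) == distance(B, A))              # True: AB and BA have equal length
```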



Figure 17 


We are also in the position to justify our definition of perpendicularity.
Given A, B in the plane, the condition that

$\|A + B\| = \|A - B\|$

(illustrated in Fig. 18(b)) coincides with the geometric property that A
should be perpendicular to B.

Figure 18

Taking the square of each side, we see that this condition is equivalent
with

$(A + B) \cdot (A + B) = (A - B) \cdot (A - B)$

and expanding out, this equality is equivalent with

$A \cdot A + 2A \cdot B + B \cdot B = A \cdot A - 2A \cdot B + B \cdot B.$

Making cancellations, we obtain the equivalent condition

$4A \cdot B = 0$

or

$A \cdot B = 0.$

This achieves what we wanted to show, namely that

$\|A - B\| = \|A + B\|$ if and only if $A \cdot B = 0$.

Observe that we have the general Pythagoras theorem: If A, B are
perpendicular, then

$\|A + B\|^2 = \|A\|^2 + \|B\|^2.$

The theorem is illustrated on Fig. 19.

Figure 19

To prove this, we use the definitions, namely

$\|A + B\|^2 = (A + B) \cdot (A + B) = A^2 + 2A \cdot B + B^2$

$= \|A\|^2 + \|B\|^2,$

because $A \cdot B = 0$, and $A \cdot A = \|A\|^2$, $B \cdot B = \|B\|^2$ by definition.

Remark. If A is perpendicular to B, and x is any number, then A is
also perpendicular to xB because

$A \cdot xB = xA \cdot B = 0.$

We shall now use the notion of perpendicularity to derive the notion
of projection. Let A, B be two vectors and $B \neq O$. We wish to define the
projection of A along B, which will be a vector P as shown in the picture.



We seek a vector P such that A - P is perpendicular to B, and such that
P can be written in the form P = cB for some number c. Suppose that
we can find such a number c, namely one satisfying

$(A - cB) \cdot B = 0.$

We then obtain

$A \cdot B = cB \cdot B,$

and therefore

$c = \frac{A \cdot B}{B \cdot B}.$

We see that such a number c is uniquely determined by our condition of
perpendicularity. Conversely, if we let c have the above value, then we
have

$(A - cB) \cdot B = A \cdot B - cB \cdot B = 0.$

Thus this value of c satisfies our requirement.

We now define the vector cB to be the projection of A along B, if c is
the number

$c = \frac{A \cdot B}{B \cdot B},$

and we define c to be the component of A along B. If B is a unit vector,
then we have simply

$c = A \cdot B.$

Example. Let A = (1, 2, -3) and B = (1, 1, 2). Then the component
of A along B is the number

$c = \frac{A \cdot B}{B \cdot B} = \frac{-3}{6} = -\frac{1}{2}.$

Hence the projection of A along B is the vector

$cB = \left( -\tfrac{1}{2}, -\tfrac{1}{2}, -1 \right).$

Our construction has an immediate interpretation in the plane, which
gives us a geometric interpretation for the scalar product. Namely,
assume $A \neq O$ and look at the angle $\theta$ between A and B (Fig. 21). Then
from plane geometry we see that

$\cos \theta = \frac{c\|B\|}{\|A\|},$

or substituting the value for c obtained above,

$A \cdot B = \|A\| \, \|B\| \cos \theta.$
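
The component and projection of the example can be recomputed exactly, and the defining property (A - cB perpendicular to B) can be checked at the same time. This is a sketch under the assumption of integer coordinates, so that exact fractions apply; the names `dot`, `component`, and `projection` are hypothetical.

```python
from fractions import Fraction

def dot(A, B):
    """Scalar product of two n-tuples of equal length."""
    return sum(a * b for a, b in zip(A, B))

def component(A, B):
    """The number c = (A . B) / (B . B); B must not be the zero vector."""
    bb = dot(B, B)
    if bb == 0:
        raise ValueError("B must be different from O")
    return Fraction(dot(A, B), bb)      # assumes integer coordinates

def projection(A, B):
    """The vector cB, the projection of A along B."""
    c = component(A, B)
    return tuple(c * b for b in B)

A, B = (1, 2, -3), (1, 1, 2)
c = component(A, B)
P = projection(A, B)
residual = tuple(a - p for a, p in zip(A, P))   # this is A - cB

print(c)                   # -1/2
print(P == (c, c, 2 * c))  # True: P = (-1/2, -1/2, -1)
print(dot(residual, B))    # 0, so A - cB is perpendicular to B
```

Since c ||B|| / ||A|| and (A . B) / (||A|| ||B||) are the same number, the value of c computed here also gives the cosine relation stated above once the two norms are known.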



In some treatments of vectors, one takes the relation

$A \cdot B = \|A\| \, \|B\| \cos \theta$

as definition of the scalar product. This is subject to the following
disadvantages, not to say objections:

(a) The four properties of the scalar product SP 1 through SP 4 are then
by no means obvious.

(b) Even in 3-space, one has to rely on geometric intuition to obtain the
cosine of the angle between A and B, and this intuition is less clear than
in the plane. In higher dimensional space, it fails even more.

(c) It is extremely hard to work with such a definition to obtain further
properties of the scalar product.

Thus we prefer to lay obvious algebraic foundations, and then recover
very simply all the properties. Aside from that, in analysis, one uses
scalar products in the context of functions, where $\cos \theta$ becomes
completely meaningless, for instance in Exercise 5 of §3, which is the starting
point of the theory of Fourier series.

We shall prove further properties of the norm and scalar product using
our results on perpendicularity. First note a special case. If

$E_i = (0, \ldots, 0, 1, 0, \ldots, 0)$

is the i-th unit vector of $R^n$, and

$A = (a_1, \ldots, a_n),$

then $A \cdot E_i = a_i$ is the i-th component of A, i.e. the component of A
along $E_i$. We have

$|a_i| = \sqrt{a_i^2} \leq \sqrt{a_1^2 + \cdots + a_n^2} = \|A\|,$

so that the absolute value of each component of A is at most equal to the
length of A.

We don’t have to deal only with the special unit vector as above. Let
E be any unit vector, that is a vector of length 1. Let c be the component
of A along E. We saw that

$c = A \cdot E.$

Then A - cE is perpendicular to E, and

A = A - cE + cE.

Then A - cE is also perpendicular to cE, and by the Pythagoras theorem,
we find

$\|A\|^2 = \|A - cE\|^2 + \|cE\|^2 = \|A - cE\|^2 + c^2.$

Thus we have the inequality $c^2 \leq \|A\|^2$, and

$|c| \leq \|A\|.$


In the next theorem, we generalize this inequality to a dot product
A • B when B is not necessarily a unit vector.

Theorem 2. Let A, B be two vectors in $R^n$. Then

$|A \cdot B| \leq \|A\| \, \|B\|.$

Proof. If B = O, then both sides of the inequality are equal to 0, and
so our assertion is obvious. Suppose that $B \neq O$. Let E be the unit vector
in the direction of B, so that

$E = \frac{B}{\|B\|}.$

We use the result just derived, namely $|A \cdot E| \leq \|A\|$, and find

$\frac{|A \cdot B|}{\|B\|} \leq \|A\|.$
Multiplying by $\|B\|$ yields the proof of our theorem.

In view of Theorem 2, we see that for vectors A, B in n-space, the number

$\frac{A \cdot B}{\|A\| \, \|B\|}$

has absolute value $\leq 1$. Consequently,

$-1 \leq \frac{A \cdot B}{\|A\| \, \|B\|} \leq 1,$

and there exists a unique angle $\theta$ such that $0 \leq \theta \leq \pi$, and such that

$\cos \theta = \frac{A \cdot B}{\|A\| \, \|B\|}.$

We define this angle to be the angle between A and B.


Example. Let A = (1, 2, -3) and B = (2, 1, 5). Find the cosine of
the angle $\theta$ between A and B.

By definition, we must have

$\cos \theta = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{2 + 2 - 15}{\sqrt{14}\sqrt{30}} = \frac{-11}{\sqrt{420}}.$

The inequality of Theorem 2 is known as the Schwarz inequality. 
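
The angle between two non-zero vectors can be computed from this definition. In the sketch below the names `dot`, `norm`, `cos_angle`, and `angle` are hypothetical. The clamp to the interval [-1, 1] is an implementation precaution and not part of the definition: the Schwarz inequality guarantees the exact quotient lies in that interval, but floating-point rounding could push it slightly outside.

```python
import math

def dot(A, B):
    """Scalar product of two n-tuples of equal length."""
    return sum(a * b for a, b in zip(A, B))

def norm(A):
    """Length of A."""
    return math.sqrt(dot(A, A))

def cos_angle(A, B):
    """Cosine of the angle between two non-zero vectors A and B."""
    na, nb = norm(A), norm(B)
    if na == 0 or nb == 0:
        raise ValueError("the angle is only defined for non-zero vectors")
    value = dot(A, B) / (na * nb)
    return max(-1.0, min(1.0, value))   # guard against rounding error

def angle(A, B):
    """The unique angle between 0 and pi with the cosine above."""
    return math.acos(cos_angle(A, B))

A, B = (1, 2, -3), (2, 1, 5)
print(math.isclose(cos_angle(A, B), -11 / math.sqrt(420)))   # True
print(0 <= angle(A, B) <= math.pi)                           # True
```

Because the cosine is negative here, the angle returned lies between pi/2 and pi.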


Theorem 3. Let A, B be vectors. Then

$\|A + B\| \leq \|A\| + \|B\|.$

Proof. Both sides of this inequality are positive or 0. Hence it will
suffice to prove that their squares satisfy the desired inequality, in other
words,

$(A + B) \cdot (A + B) \leq (\|A\| + \|B\|)^2.$

To do this, we consider

$(A + B) \cdot (A + B) = A \cdot A + 2A \cdot B + B \cdot B.$

In view of our previous result, this satisfies the inequality

$\leq \|A\|^2 + 2\|A\| \, \|B\| + \|B\|^2,$

and the right-hand side is none other than

$(\|A\| + \|B\|)^2.$

Our theorem is proved.
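
Theorems 2 and 3 can be spot-checked on randomly chosen vectors. Such a check is no substitute for the proofs; it is only a sanity test. The sketch assumes integer coordinates between -10 and 10 and a fixed random seed, both arbitrary choices, and the function name `check_inequalities` is hypothetical. The Schwarz inequality is tested in squared form so that it can be verified in exact integer arithmetic; the triangle inequality is tested in floating point with a small tolerance.

```python
import math
import random

def dot(A, B):
    """Scalar product of two n-tuples of equal length."""
    return sum(a * b for a, b in zip(A, B))

def norm(A):
    """Length of A."""
    return math.sqrt(dot(A, A))

def check_inequalities(n, trials=1000, seed=0):
    """Return True if no random pair in n-space violates Theorem 2 or 3."""
    rng = random.Random(seed)
    for _ in range(trials):
        A = [rng.randint(-10, 10) for _ in range(n)]
        B = [rng.randint(-10, 10) for _ in range(n)]
        # Theorem 2 squared: (A . B)^2 <= (A . A)(B . B), exact for integers.
        if dot(A, B) ** 2 > dot(A, A) * dot(B, B):
            return False
        # Theorem 3, allowing for floating-point rounding.
        S = [a + b for a, b in zip(A, B)]
        if norm(S) > norm(A) + norm(B) + 1e-9:
            return False
    return True

print(check_inequalities(2))   # True
print(check_inequalities(5))   # True
```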

Theorem 3 is known as the triangle inequality. The reason for this is
that if we draw a triangle as in Fig. 22, then Theorem 3 expresses the fact
that the length of one side is ≤ the sum of the lengths of the other two
sides.

Figure 22
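
The triangle inequality can also be observed numerically. The sketch below, with helper names of our own choosing, compares ||A + B|| with ||A|| + ||B|| for a few sample pairs, including a pair pointing in opposite directions, for which the left-hand side is 0.

```python
import math

def norm(a):
    # Length of a vector.
    return math.sqrt(sum(x * x for x in a))

def add(a, b):
    # Componentwise sum A + B.
    return tuple(x + y for x, y in zip(a, b))

pairs = [
    ((1, 2, -3), (2, 1, 5)),
    ((2, -1), (-1, 1)),
    ((1, 0, 0, 0), (-1, 0, 0, 0)),  # opposite directions: A + B = O
]
# Each entry is the pair (||A + B||, ||A|| + ||B||).
results = [(norm(add(a, b)), norm(a) + norm(b)) for a, b in pairs]
for lhs, rhs in results:
    print(lhs, "<=", rhs)
```

A finite list of samples is of course only an illustration and not a substitute for the proof.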


Exercises

1. Find the length of the vector A in the following cases.

(a) A = (2, -1), B = (-1, 1) (b) A = (-1, 3), B = (0, 4)

(c) A = (2, -1, 5), B = (-1, 1, 1) (d) A = (-1, -2, 3), B = (-1, 3, -4)

(e) A = (π, 3, -1), B = (2π, -3, 7) (f) A = (15, -2, 4), B = (π, 3, -1)

2. Find the length of vector B in the above cases. 

3. Find the projection of A along B in the above cases. 

4. Find the projection of B along A in the above cases. 

5. Determine the cosine of the angles of the triangle whose vertices are 

(a) (2, -1,1), (1, -3, -5), (3, -4, -4). 

(b) (3,1,1), (-1,2,1), (2,-2, 5). 

6. Let A_1, ..., A_r be non-zero vectors which are mutually perpendicular, in
other words A_i · A_j = 0 if i ≠ j. Let c_1, ..., c_r be numbers such that

c_1 A_1 + ··· + c_r A_r = O.

Show that all c_i = 0.

7. If A, B are two vectors in n-space, denote by d(A, B) the distance between
A and B, that is d(A, B) = ||B - A||. Show that

d(A, B) = d(B, A),

and that for any three vectors A, B, C we have

d(A, B) ≤ d(A, C) + d(B, C).

8. For any vectors A, B in n-space, prove the following relations:

(a) ||A + B||² + ||A - B||² = 2||A||² + 2||B||².

(b) ||A + B||² = ||A||² + ||B||² + 2A · B.

(c) ||A + B||² - ||A - B||² = 4A · B.

Interpret (a) as a "parallelogram law".

9. Show that if θ is the angle between A and B, then

||A - B||² = ||A||² + ||B||² - 2||A|| ||B|| cos θ.

10. Let A, B, C be three non-zero vectors. If A · B = A · C, show by an
example that we do not necessarily have B = C.

11. Let A, B be non-zero vectors, mutually perpendicular. Show that for any
number c we have ||A + cB|| ≥ ||A||.

12. Let A, B be non-zero vectors. Assume that ||A + cB|| ≥ ||A|| for all
numbers c. Show that A, B are perpendicular.

13. Let f(x) = x and g(x) = x². Using the scalar product

(f, g) = ∫ f(x)g(x) dx,

find the projection of f along g and the projection of g along f, using the
same definition of projection that has been given in the text, and which
did not refer to coordinates.

14. For this same scalar product, the norm of a function f is √(f, f). Find
the norm of the constant function 1.

15. Consider now functions on the interval [-π, π]. Define the scalar product by

(f, g) = ∫_{-π}^{π} f(x)g(x) dx.

Find the norm of the functions sin 3x and cos x.

16. Find the norm of the constant function 1 for the scalar product of
Exercise 15.

17. In general, find the norm of the functions sin nx and cos mx, where m, n
are positive integers.


§5. Lines and planes 

We define the parametric equation of a straight line passing through
a point P in the direction of a vector A ≠ O to be

X = P + tA, 

where t runs through all numbers (Fig. 23). 


Figure 23

Suppose that we work in the plane, and write the coordinates of a
point X as (x, y). Let P = (p, q) and A = (a, b). Then in terms of the
coordinates, we can write

x = p + ta, y = q + tb.

We can then eliminate t and obtain the usual equation relating x and y. 

For example, let P = (2, 1) and A = (-1, 5). Then the parametric
equation of the line through P in the direction of A gives us

(*) x = 2 - t, y = 1 + 5t.

Multiplying the first equation by 5 and adding yields

(**) 5x + y = 11,

which is familiar. 

This elimination of t shows that every pair (x, y) which satisfies the 
parametric equation (*) for some value of t also satisfies equation (**). 
Conversely, suppose we have a pair of numbers (x, y) satisfying (**). Let 
t = 2 - x. Then

y = 11 - 5x = 11 - 5(2 - t) = 1 + 5t.

Hence there exists some value of t which satisfies equation (*). Thus we
have proved that the pairs (x, y) which are solutions of (**) are exactly
the same pairs of numbers as those obtained by giving arbitrary values
for t in (*). Thus the straight line can be described parametrically as in
(*) or in terms of its usual equation (**). Starting with the ordinary
equation

5x + y = 11,

we let t = 2 - x in order to recover the specific parametrization of (*).
When we parametrize a straight line in the form

X = P + tA,

we have of course infinitely many choices for P on the line, and also 
infinitely many choices for A, differing by a scalar multiple. We can 
always select at least one. Namely, given an equation 

ax + by = c 

with numbers a, b, c, suppose that a ≠ 0. We use y as parameter, and let

y = t.

Then we can solve for x, namely

x = c/a - (b/a)t.

Let P = (c/a, 0) and A = (-b/a, 1). We see that an arbitrary point
(x, y) satisfying the equation 


ax + by = c 

can be expressed parametrically, namely 

(x, y) = P + tA. 
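
The passage from the ordinary equation ax + by = c to a parametrization can be sketched in Python as follows. The function names parametrize and point are hypothetical, and exact fractions are used so that the check 5x + y = 11 is not disturbed by rounding. As in the text, the sketch assumes a ≠ 0 and uses y as parameter.

```python
from fractions import Fraction

def parametrize(a, b, c):
    """Return (P, A) such that the line ax + by = c is X = P + tA (needs a != 0)."""
    if a == 0:
        raise ValueError("this sketch assumes a != 0")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    P = (c / a, Fraction(0))
    A = (-b / a, Fraction(1))
    return P, A

def point(P, A, t):
    # The point P + tA on the line.
    return tuple(p + t * q for p, q in zip(P, A))

P, A = parametrize(5, 1, 11)
print(P, A)
for t in range(-2, 3):
    x, y = point(P, A, t)
    print(t, (x, y), 5 * x + y == 11)
```

This parametrization of 5x + y = 11 differs from (*), which illustrates the remark that there are infinitely many choices for P and A.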

In higher dimension, starting with a parametric equation 

X = P + tA, 

we cannot eliminate t, and thus the parametric equation is the only one 
available to describe a straight line. 

However, we can describe planes by an equation analogous to the single 
equation of the line. We proceed as follows. 


Figure 24


Let P be a point in 3-space and consider a located vector ON. We define 
the plane passing through P perpendicular to ON to be the collection 
of all points X such that the located vector PX is perpendicular to ON. 
According to our definitions, this amounts to the condition 

(X - P) • N = 0, 

which can also be written as 

X · N = P · N.

We shall also say that this plane is the one perpendicular to N, and
consists of all vectors X such that X - P is perpendicular to N. We have
drawn a typical situation in 3-space in Fig. 24.

Instead of saying that N is perpendicular to the plane, one also says
that N is normal to the plane.

Let t be a number ≠ 0. Then the set of points X such that

(X - P) · N = 0

coincides with the set of points X such that

(X - P) · tN = 0.

Thus we may say that our plane is the plane passing through P and
perpendicular to the line in the direction of N. To find the equation of the
plane, we could use any vector tN (with t ≠ 0) instead of N.

In 3-space, we get an ordinary plane. For example, let P = (2, 1, -1)
and N = (-1, 1, 3). Then the equation of the plane passing through P
and perpendicular to N is

-x + y + 3z = -2 + 1 - 3

or

-x + y + 3z = -4.
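
The computation just made amounts to evaluating P · N. A minimal Python sketch, in which the name plane_through is ours and not the text's, returns the coefficients and the right-hand side of the equation X · N = P · N.

```python
def dot(a, b):
    # Scalar product of two vectors.
    return sum(x * y for x, y in zip(a, b))

def plane_through(P, N):
    """Return (N, d) such that the plane is the set of X with X . N = d."""
    return N, dot(P, N)

P = (2, 1, -1)
N, d = plane_through(P, (-1, 1, 3))
print(N, d)  # coefficients of x, y, z and the right-hand side
```

By construction the point P itself satisfies the resulting equation.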

Observe that in 2-space, with X = (x, y), the formulas lead to the
equation of the line in the ordinary sense. For example, the equation of
the line passing through (4, -3) and perpendicular to (-5, 2) is

-5x + 2y = -20 - 6 = -26.

We are now in position to interpret the coefficients (-5, 2) of x and y
in this equation. They give rise to a vector perpendicular to the line. In
any equation

ax + by = c

the vector (a, b) is perpendicular to the line determined by the equation.
Similarly, in 3-space, the vector (a, b, c) is perpendicular to the plane
determined by the equation

ax + by + cz = d.

For example, the plane determined by the equation 

2x - y + 3z = 5

is perpendicular to the vector (2, -1, 3). If we want to find a point in
that plane, we of course have many choices. We can give arbitrary values
to x and y, and then solve for z. To get a concrete point, let x = 1, y = 1.
Then we solve for z, namely

3z = 5 - 2 + 1 = 4,

so that z = 4/3. Thus

(1, 1, 4/3)

is a point in the plane.

In n-space, the equation X · N = P · N is said to be the equation of a
hyperplane. For example,

3x - y + z + 2w = 5

is the equation of a hyperplane in 4-space, perpendicular to (3, -1, 1, 2).


Two vectors A, B are said to be parallel if there exists a number c ≠ 0
such that cA = B. Two lines are said to be parallel if, given two distinct
points P_1, Q_1 on the first line and P_2, Q_2 on the second, the vectors

P_1 - Q_1

and

P_2 - Q_2

are parallel.


Two planes are said to be parallel (in 3-space) if their normal vectors 
are parallel. They are said to be perpendicular if their normal vectors are 
perpendicular. The angle between two planes is defined to be the angle 
between their normal vectors. 


Example 1. Find the cosine of the angle between the planes

2x - y + z = 0,
x + 2y - z = 1.

This cosine is the cosine of the angle between the vectors

A = (2, -1, 1) and B = (1, 2, -1).

It is therefore equal to

(A · B) / (||A|| ||B||) = -1/6.


Example 2. Let 

Q = (1, 1, 1) and P = (1, -1, 2).

Let

N = (1, 2, 3).

Find the point of intersection of the line through P in the direction of N, 
and the plane through Q perpendicular to N. 

The parametric equation of the line through P in the direction of N is 

(1) X = P + tN. 

The equation of the plane through Q perpendicular to N is 

(2) (X - Q) · N = 0.

We visualize the line and plane as follows:

We must find the value of t such that the vector X in (1) also satisfies (2),
that is

(P + tN - Q) · N = 0,

or after using the rules of the dot product,

(P - Q) · N + tN · N = 0.

Solving for t yields

t = (Q - P) · N / (N · N) = 1/14.

Thus the desired point of intersection is

P + tN = (1, -1, 2) + (1/14)(1, 2, 3) = (15/14, -12/14, 31/14).
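
The steps of Example 2 can be retraced in exact arithmetic. In the sketch below the function name line_plane_intersection is hypothetical; it solves for t by the formula t = (Q - P) · N / (N · N) and then forms P + tN, with P = (1, -1, 2), N = (1, 2, 3) and Q = (1, 1, 1).

```python
from fractions import Fraction

def dot(a, b):
    # Scalar product of two vectors.
    return sum(x * y for x, y in zip(a, b))

def line_plane_intersection(P, N, Q):
    """Point where X = P + tN meets the plane (X - Q) . N = 0; N must be non-zero."""
    diff = tuple(q - p for p, q in zip(P, Q))  # the vector Q - P
    t = Fraction(dot(diff, N), dot(N, N))
    return t, tuple(p + t * n for p, n in zip(P, N))

P, N, Q = (1, -1, 2), (1, 2, 3), (1, 1, 1)
t, X = line_plane_intersection(P, N, Q)
print(t, X)
# The point found must satisfy the equation of the plane.
print(dot(tuple(x - q for x, q in zip(X, Q)), N) == 0)
```

Python reduces the fractions automatically, so the coordinates -12/14 appears as -6/7.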

Example 3. Find the equation of the plane passing through the three
points

P_1 = (1, 2, -1), P_2 = (-1, 1, 4), P_3 = (1, 3, -2).

We visualize schematically the three points as follows:

Then we find a vector N perpendicular to P_1P_2 and P_1P_3, or in other
words, perpendicular to P_2 - P_1 and P_3 - P_1. We have

P_2 - P_1 = (-2, -1, 5),

P_3 - P_1 = (0, 1, -1).

Let N = (a, b, c). We must solve:

-2a - b + 5c = 0,
b - c = 0.

We take b = c = 1 and solve for a, getting a = 2. Then

N = (2, 1, 1)

satisfies our requirements. The plane perpendicular to N, passing through
P_1 is the desired plane. Its equation is therefore

2x + y + z = 2 + 2 - 1 = 3.
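
It is worth verifying the result of Example 3: the vector N = (2, 1, 1) should be perpendicular to both difference vectors, and each of the three points P_1 = (1, 2, -1), P_2 = (-1, 1, 4), P_3 = (1, 3, -2) should give the same value of X · N. The following sketch, with helper names of our own, performs both checks.

```python
def dot(a, b):
    # Scalar product of two vectors.
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    # Componentwise difference A - B.
    return tuple(x - y for x, y in zip(a, b))

P1, P2, P3 = (1, 2, -1), (-1, 1, 4), (1, 3, -2)
N = (2, 1, 1)

# N must be perpendicular to P2 - P1 and to P3 - P1.
perpendicular = dot(N, sub(P2, P1)) == 0 and dot(N, sub(P3, P1)) == 0
# Each of the three points must give the same value of X . N.
values = [dot(N, P) for P in (P1, P2, P3)]
print(perpendicular, values)
```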


Exercises 

Find a parametric equation for the line passing through the following points. 

1. (1, 1, -1) and (-2, 1, 3) 2. (-1, 5, 2) and (3, -4, 1)

Find the equation of the line in 2-space, perpendicular to A and passing
through P, for the following values of A and P.

3. A = (1, -1), P = (-5, 3) 4. A = (-5, 4), P = (3, 2)

5. Show that the lines

3x - 5y = 1, 2x + 3y = 5

are not perpendicular.

6. Which of the following pairs of lines are perpendicular?

(a) 3x - 5y = 1 and 2x + y = 2

(b) 2x + ly = 1 and x — y = 5 

(c) 3x - 5y = 1 and 5x + 3y = 7

(d) -x + y = 2 and x + y = 9

7. Find the equation of the plane perpendicular to the given vector N and
passing through the given point P.

(a) N = (1, -1, 3), P = (4, 2, -1)

(b) N = (-3, -2, 4), P = (2, π, -5)

(c) N = (-1, 0, 5), P = (2, 3, 7)

8. Find the equation of the plane passing through the following three points.

(a) (2, 1, 1), (3, -1, 1), (4, 1, -1)

(b) (-2, 3, -1), (2, 2, 3), (-4, -1, 1)

(c) (-5, -1, 2), (1, 2, -1), (3, -1, 2)

9. Find a vector perpendicular to (1, 2, -3) and (2, -1, 3), and another
vector perpendicular to (-1, 3, 2) and (2, 1, 1).

10. Let P be the point (1, 2, 3, 4) and Q the point (4, 3, 2, 1). Let A be the
vector (1, 1, 1, 1). Let L be the line passing through P and parallel to A.

(a) Given a point X on the line L, compute the distance between Q and X
(as a function of the parameter t).

(b) Show that there is precisely one point X_0 on the line such that this
distance achieves a minimum, and that this minimum is 2√5.

(c) Show that X_0 - Q is perpendicular to the line.

11. Let P be the point (1, -1, 3, 1) and Q the point (1, 1, -1, 2). Let A be
the vector (1, -3, 2, 1). Solve the same questions as in the preceding
problem, except that in this case the minimum distance is √(146/15).

12. Find a vector parallel to the line of intersection of the two planes 

2x — y + z = 1, 3x + y + z = 2. 

13. Same question for the planes, 

2x + y + 5z = 2, 3x - 2y + z = 3.

14. Find a parametric equation for the line of intersection of the planes of 
Exercises 12 and 13. 

15. Find the cosine of the angle between the following planes: 

(a) x + y + z = 1 (b) 2x + 3y — z = 2 

x — y — z = 5 x — y + z = l 

(c) x + 2y — z = 1 (d) 2x + y + z = 3 

—x + 3y + z = 2 —x — y + z = t 

16. (a) Let P = (1, 3, 5) and A = (—2,1, 1). Find the intersection of the line 

through P in the direction of A, and the plane 2x + 3y — z = 1. 

(b) Let P = (1,2, —1). Find the point of intersection of the plane 

3x — 4y + z = 2, 

with the line through P, perpendicular to that plane. 

17. Let Q = (1, -1, 2), P = (1, 3, -2), and N = (1, 2, 2). Find the point
of the intersection of the line through P in the direction of N, and the plane 
through Q perpendicular to N. 

18. Let P, Q be two points and N a vector in 3-space. Let P' be the point of 
intersection of the line through P, in the direction of N, and the plane 
through Q, perpendicular to N. We define the distance from P to that plane 
to be the distance between P and P'. Find the distance when 

P= (1,3,5), Q = (-1,1,7), N = (-1,1,-1). 

19. In the notation of Exercise 18, show that the general formula for the
distance is given by

|(Q - P) · N| / ||N||.


20. Find the distance between the indicated point and plane. 

(a) (1, 1, 2) and 3x + y — 5z = 2 

(b) (-1, 3, 2) and 2x - 4y + z = 1

21. Let P = (1,3, —1) and Q = ( — 4,5,2). Determine the coordinates of 
the following points: 

(a) The midpoint of the line segment between P and Q. 

(b) The two points on this line segment lying one-third and two-thirds of 
the way from P to Q. 

(c) The point lying one-fifth of the way from P to Q. 

(d) The point lying two-fifths of the way from P to Q. 

22. If P, Q are two arbitrary points in n-space, give the general formula for the 
midpoint of the line segment between P and Q. 


§6. The cross product 

You may omit this section and all references to it until you reach Chapter
XV, where it will be used in an essential way.

This section applies only in 3-space! 

Let A = (a_1, a_2, a_3) and B = (b_1, b_2, b_3) be two vectors in 3-space.
We define their cross product

A × B = (a_2 b_3 - a_3 b_2, a_3 b_1 - a_1 b_3, a_1 b_2 - a_2 b_1).

For instance, if A = (2, 3, -1) and B = (-1, 1, 5),

then

A × B = (16, -9, 5).
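
The definition translates directly into a short function. The following Python sketch, in which the name cross is our own choice, transcribes the three coordinates of the definition and recomputes the instance A = (2, 3, -1), B = (-1, 1, 5) just given.

```python
def cross(a, b):
    # Cross product of two vectors in 3-space, coordinate by coordinate.
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

A = (2, 3, -1)
B = (-1, 1, 5)
print(cross(A, B))
print(cross(B, A))
```

Interchanging the two vectors changes the sign of every coordinate.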

We leave the following assertions as exercises: 

CP 1. A × B = -(B × A).

CP 2. A × (B + C) = (A × B) + (A × C),

and

(B + C) × A = B × A + C × A.

CP 3. For any number a, we have

(aA) × B = a(A × B) = A × (aB).

CP 4. (A × B) × C = (A · C)B - (B · C)A.

CP 5. A × B is perpendicular to both A and B.

As an example, we carry out this computation. We have

A · (A × B) = a_1(a_2 b_3 - a_3 b_2) + a_2(a_3 b_1 - a_1 b_3) + a_3(a_1 b_2 - a_2 b_1)
= 0

because all terms cancel. Similarly for B · (A × B). This perpendicularity
may be drawn as follows.


Figure 27 

The vector A × B is perpendicular to the plane spanned by A and B. So
is B × A, but B × A points in the opposite direction.

Finally, as a last property, we have 

CP 6. (A × B)² = (A · A)(B · B) - (A · B)².

Again, this can be verified by a computation on the coordinates. Namely,
we have

(A × B) · (A × B)

= (a_2 b_3 - a_3 b_2)² + (a_3 b_1 - a_1 b_3)² + (a_1 b_2 - a_2 b_1)²,

(A · A)(B · B) - (A · B)²

= (a_1² + a_2² + a_3²)(b_1² + b_2² + b_3²) - (a_1 b_1 + a_2 b_2 + a_3 b_3)².

Expanding everything out, we find that CP 6 drops out.

From our interpretation of the dot product, and the definition of the
norm, we can rewrite CP 6 in the form

||A × B||² = ||A||² ||B||² - ||A||² ||B||² cos² θ,

where θ is the angle between A and B. Hence we obtain

||A × B||² = ||A||² ||B||² sin² θ

or

||A × B|| = ||A|| ||B|| |sin θ|.

This is analogous to the formula which gave us the absolute value of A · B.

This formula can be used to make another interpretation of the cross
product. Indeed, we see that ||A × B|| is the area of the parallelogram
spanned by A and B, as shown on Fig. 28.

If we consider the plane containing the located vectors OA and OB, then
the picture looks like that in Fig. 29, and our assertion amounts
simply to the statement that the area of a parallelogram is equal to the
base times the altitude.





Example. Let A = (3, 1, 4) and B = (-2, 5, 3). Then the area of the
parallelogram spanned by A and B is easily computed. First we get the
cross product,

A × B = (3 - 20, -8 - 9, 15 + 2) = (-17, -17, 17).

The area of the parallelogram spanned by A and B is therefore equal to
the norm of this vector, and that is

||A × B|| = √(3 · 17²) = 17√3.

These considerations will be used especially in Chapter XV, when we
discuss surface area, and in Chapter XIII, when we deal with the change of
variables formula.
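
The area computation of the example, together with property CP 6, can be confirmed by a few lines of Python. This is only a sketch for the vectors A = (3, 1, 4) and B = (-2, 5, 3) of the example; the helper names cross and dot are ours.

```python
import math

def dot(a, b):
    # Scalar product of two vectors.
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    # Cross product in 3-space, following the coordinate definition.
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

A = (3, 1, 4)
B = (-2, 5, 3)
C = cross(A, B)
area = math.sqrt(dot(C, C))  # norm of A x B
print(C, area)
# CP 6: (A x B)^2 = (A . A)(B . B) - (A . B)^2, checked in integers.
print(dot(C, C) == dot(A, A) * dot(B, B) - dot(A, B) ** 2)
```

Since the coordinates are integers, the identity CP 6 is checked exactly, while the area itself is a floating-point approximation of 17√3.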


Exercises 

Find A × B for the following vectors.

1. A = (1, -1, 1) and B = (-2, 3, 1)

2. A = (-1, 1, 2) and B = (1, 0, -1)

3. A = (1, 1, -3) and B = (-1, -2, -3)

4. Find A × A and B × B, in Exercises 1 through 3.

5. Let E_1 = (1, 0, 0), E_2 = (0, 1, 0), and E_3 = (0, 0, 1). Find E_1 × E_2,
E_2 × E_3, E_3 × E_1.

6. Show that for any vector A in 3-space we have A × A = O.

7. Compute E_1 × (E_1 × E_2) and (E_1 × E_1) × E_2. Are these vectors equal
to each other?

8. Carry out the proofs of CP 1 through CP 4. 

9. Compute the area of the parallelogram spanned by the following vectors. 

(a) A = (3, —2, 4) and B = (5,1,1) 

(b) A = (3,1, 2) and B = (— 1, 2, 4) 

(c) A = (4, -2, 5) and B = (3,1, -1) 

(d) A = (-2,1, 3) and B = (2, -3, 4) 



CHAPTER II 


Differentiation of Vectors 


We begin to acquire the flavor of the mixture of algebra, geometry, 
and differentiation. Each gains in appeal from being mixed with the 
other two. 

The chain rule especially leads into the classical theory of curves. As 
you will see, the chain rule in its various aspects occurs very frequently 
in this book, and forms almost as basic a tool as the algebra of vectors, 
with which it will in fact be intimately mixed. 


§1. Derivative

Let I be an interval. A parametrized curve (defined on this interval) is
an association which to each point of I associates a vector. If X denotes
a curve defined on I, and t is a point of I, then X(t) denotes the vector
associated to t by X. We often write the association t ↦ X(t) as an arrow

X: I → R^n.

Each vector X(t) can be written in terms of coordinates,

X(t) = (x_1(t), ..., x_n(t)),

each x_i(t) being a function of t. We say that this curve is differentiable if
each function x_i(t) is a differentiable function of t.

For instance, the curve defined by

X(t) = (cos t, sin t, t)

is a spiral (Fig. 1). Here we have

x(t) = cos t,

y(t) = sin t,

z(t) = t.

Remark. We take the intervals of definition for our curves to be open,
closed, or also half-open or half-closed. When we define the derivative
of a curve, it is understood that the interval of definition contains more
than one point. In that case, at an end point the usual limit of

(f(a + h) - f(a)) / h

is taken for those h such that the quotient makes sense, i.e. a + h lies
in the interval. If a is a left end point, the quotient is considered only
for h > 0. If a is a right end point, the quotient is considered only for
h < 0. Then the usual rules for differentiation of functions are true in
this greater generality, and thus Rules 1 through 4 below, and the chain
rule of §3 remain true also. [An example of a statement which is not
always true for curves defined over closed intervals is given in
Exercise 11(b).]

Let us try to differentiate vectors using a Newton quotient. We consider

(X(t + h) - X(t)) / h = ((x_1(t + h) - x_1(t)) / h, ..., (x_n(t + h) - x_n(t)) / h)

and see that each component is a Newton quotient for the corresponding
coordinate. If each x_i(t) is differentiable, then each quotient

(x_i(t + h) - x_i(t)) / h

approaches the derivative dx_i/dt. For this reason, we define the derivative
dX/dt to be

dX/dt = (dx_1/dt, ..., dx_n/dt).

In fact, we could also say that the vector

(dx_1/dt, ..., dx_n/dt)

is the limit of the Newton quotient

(X(t + h) - X(t)) / h

as h approaches 0. Indeed, as h approaches 0, each component

(x_i(t + h) - x_i(t)) / h

approaches dx_i/dt. Hence the Newton quotient approaches the vector

(dx_1/dt, ..., dx_n/dt).

For example, if X(t) = (cos t, sin t, t) then

dX/dt = (-sin t, cos t, 1).

Physicists often denote dX/dt by Ẋ; thus in the previous example, we
could also write

Ẋ(t) = (-sin t, cos t, 1) = X'(t).


Figure 2 

We define the velocity vector of the curve at time t to be the vector
X'(t). In our previous example, when

X(t) = (cos t, sin t, t),

the velocity vector at t = π is

X'(π) = (0, -1, 1),

and for t = π/4 we get

X'(π/4) = (-1/√2, 1/√2, 1).
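
To see the Newton quotient at work, the following sketch compares it, for a small value of h, with the velocity vector of the spiral X(t) = (cos t, sin t, t) obtained by differentiating each coordinate. The function names and the step h = 10^-6 are our own choices for illustration.

```python
import math

def X(t):
    # The spiral of the text.
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    # X'(t), obtained by differentiating each coordinate by hand.
    return (-math.sin(t), math.cos(t), 1.0)

def newton_quotient(curve, t, h):
    # Componentwise (X(t + h) - X(t)) / h.
    return tuple((a - b) / h for a, b in zip(curve(t + h), curve(t)))

t = math.pi
approx = newton_quotient(X, t, 1e-6)
exact = velocity(t)
print(approx)
print(exact)
```

The two vectors agree to about six decimal places; taking h smaller improves the agreement only up to the limits of floating-point rounding.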

The velocity vector is located at the origin, but when we translate it to
the point X(t), then we visualize it as tangent to the curve, as in the next
picture.

Figure 3

We define the tangent line to a curve X at time t to be the line passing
through X(t) in the direction of X'(t), provided that X'(t) ≠ O.
Otherwise, we don't define a tangent line.

Example 1. Find a parametric equation of the tangent line to the curve
X(t) = (sin t, cos t) at t = π/3.

We have

X'(π/3) = (1/2, -√3/2)

and X(π/3) = (√3/2, 1/2).

Let P = X(π/3) and A = X'(π/3). Then a parametric equation of the
tangent line at the required point is

L(t) = P + tA = (√3/2, 1/2) + (1/2, -√3/2)t.

(We use another letter L because X is already occupied.) In terms of the
coordinates L(t) = (x(t), y(t)), we can write the tangent line as

x(t) = √3/2 + (1/2)t,

y(t) = 1/2 - (√3/2)t.
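
The construction of the tangent line in Example 1 for the curve X(t) = (sin t, cos t) can be sketched in Python. The function tangent_line below is a hypothetical helper; it returns the function t ↦ P + tA with P = X(t_0) and A = X'(t_0), and refuses to do so when the velocity vector is O, in accordance with the definition.

```python
import math

def X(t):
    return (math.sin(t), math.cos(t))

def dX(t):
    # Velocity X'(t), differentiated by hand.
    return (math.cos(t), -math.sin(t))

def tangent_line(t0):
    # Returns the function L(t) = P + tA with P = X(t0), A = X'(t0).
    P, A = X(t0), dX(t0)
    if all(a == 0 for a in A):
        raise ValueError("tangent line is not defined when X'(t0) = O")
    return lambda t: tuple(p + t * a for p, a in zip(P, A))

L = tangent_line(math.pi / 3)
print(L(0.0))  # the point of tangency X(pi/3)
print(L(1.0))  # the point P + A
```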

We define the speed of the curve X(t) to be the length of the velocity
vector. If we denote the speed by v(t), then by definition we have

v(t) = ||X'(t)||,

and thus

v(t)² = X'(t)² = X'(t) · X'(t).

We can also omit the t from the notation, and write

v² = X' · X' = X'².

We define the acceleration vector to be the derivative dX'/dt, provided
of course that X' is differentiable. We shall also denote the acceleration
vector by X''. We define the acceleration scalar to be the length of the
acceleration vector, and denote it by a(t).

In the example given by X(t) = (cos t, sin t, t) we find that

X''(t) = (-cos t, -sin t, 0).

Therefore ||X''(t)|| = 1 and we see that the spiral has a constant
acceleration scalar, but not a constant acceleration vector.

Warning. a(t) is not necessarily the derivative of v(t). Almost any
example shows this. For instance, let

X(t) = (sin t, cos t).

Then v(t) = ||X'(t)|| = 1 so that dv/dt = 0. However, a simple
computation shows that X''(t) = (-sin t, -cos t) and hence a(t) = 1.
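
The warning can be illustrated numerically for the curve X(t) = (sin t, cos t). In the sketch below the derivative dv/dt of the speed is estimated by a symmetric difference quotient, while a(t) is computed as the length of the acceleration vector; the sample value t = 0.7 and the step h are arbitrary choices of ours.

```python
import math

def velocity(t):
    # X'(t) for X(t) = (sin t, cos t).
    return (math.cos(t), -math.sin(t))

def acceleration(t):
    # X''(t) for the same curve.
    return (-math.sin(t), -math.cos(t))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def speed(t):
    return norm(velocity(t))

t, h = 0.7, 1e-6
dv_dt = (speed(t + h) - speed(t - h)) / (2 * h)  # estimate of dv/dt
a = norm(acceleration(t))                         # acceleration scalar a(t)
print(dv_dt, a)
```

The first number is 0 up to rounding, the second is 1, so the two quantities are indeed different.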

We shall list the rules for differentiation. These will concern sums,
products, and the chain rule which is postponed to §3. We
make a remark concerning products. If X is a curve and f a function,
defined on the same interval I, then for each t in this interval we can take
the product

f(t)X(t)

of the number f(t) by the vector X(t). Thus if

X(t) = (x_1(t), ..., x_n(t))

then

f(t)X(t) = (f(t)x_1(t), ..., f(t)x_n(t)).

For instance, if X(t) = (cos t, sin t, t) and f(t) = e^t, then

f(t)X(t) = (e^t cos t, e^t sin t, e^t t),

and

f(π)X(π) = (e^π(-1), e^π(0), e^π π) = (-e^π, 0, e^π π).

The derivative of a curve is defined componentwise. Thus the rules 
for the derivative will be very similar to the rules for differentiating 
functions. 

Rule 1. Let X(t) and Y(t) be two differentiable curves (defined for the
same values of t). Then the sum X(t) + Y(t) is differentiable, and

d(X(t) + Y(t))/dt = dX/dt + dY/dt.

Rule 2. Let c be a number, and let X(t) be differentiable. Then cX(t) is
differentiable, and

d(cX(t))/dt = c dX/dt.

Rule 3. Let f(t) be a differentiable function, and X(t) a differentiable
curve (defined for the same values of t). Then f(t)X(t) is differentiable, and

d(fX)/dt = f(t) dX/dt + (df/dt) X(t).

Rule 4. Let X(t) and Y(t) be two differentiable curves (defined for the
same values of t). Then X(t) · Y(t) is a differentiable function whose
derivative is

d/dt [X(t) · Y(t)] = X'(t) · Y(t) + X(t) · Y'(t).

(This is formally analogous to the derivative of a product of functions, 
namely the first times the derivative of the second plus the second times 
the derivative of the first, except that the product is now a scalar product.) 

As an example of the proofs we shall give the third one in detail, and 
leave the others to you as exercises. 

Let X(t) = (x_1(t), ..., x_n(t)), and let f = f(t) be a function. Then
by definition

f(t)X(t) = (f(t)x_1(t), ..., f(t)x_n(t)).

We take the derivative of each component and apply the rule for the
derivative of a product of functions. We obtain:

d(fX)/dt = (f(t) dx_1/dt + (df/dt) x_1(t), ..., f(t) dx_n/dt + (df/dt) x_n(t)).

Using the rule for the sum of two vectors, we see that the expression on
the right is equal to

(f(t) dx_1/dt, ..., f(t) dx_n/dt) + ((df/dt) x_1(t), ..., (df/dt) x_n(t)).

We can take f out of the vector on the left and df/dt out of the vector on
the right to obtain

d(fX)/dt = f(t) dX/dt + (df/dt) X(t),

as desired.




Example 2. Let A be a fixed vector, and let f be an ordinary
differentiable function of one variable. Let F(t) = f(t)A. Then F'(t) = f'(t)A.
For instance, if F(t) = (cos t)A and A = (a, b) where a, b are fixed
numbers, then F(t) = (a cos t, b cos t) and thus

F'(t) = (-a sin t, -b sin t) = (-sin t)A.

Similarly, if A, B are fixed vectors, and

G(t) = (cos t)A + (sin t)B,

then

G'(t) = (-sin t)A + (cos t)B.

One can also give a proof for the derivative of a product which does
not use coordinates and is similar to the proof for the derivative of a
product of functions. We carry this proof out. We must consider the
Newton quotient

(X(t + h) · Y(t + h) - X(t) · Y(t)) / h

= (X(t + h) · Y(t + h) - X(t) · Y(t + h) + X(t) · Y(t + h) - X(t) · Y(t)) / h

= ((X(t + h) - X(t)) / h) · Y(t + h) + X(t) · ((Y(t + h) - Y(t)) / h).

Taking the limit as h → 0, we find

X'(t) · Y(t) + X(t) · Y'(t)

as desired.






Note that this type of proof applies without change if we replace the
dot product by, say, the cross product. A coordinate proof for the
derivative of the cross product can also be given (cf. Exercise 25).
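
Rule 4 lends itself to a numerical check. In the following sketch the curves X and Y are sample curves chosen by us, with derivatives computed by hand; the derivative of the function X(t) · Y(t) is estimated by a symmetric difference quotient and compared with X'(t) · Y(t) + X(t) · Y'(t).

```python
import math

def dot(a, b):
    # Scalar product of two vectors.
    return sum(x * y for x, y in zip(a, b))

def X(t):
    return (math.cos(t), math.sin(t), t)

def dX(t):
    return (-math.sin(t), math.cos(t), 1.0)

def Y(t):
    return (t, t * t, math.exp(t))

def dY(t):
    return (1.0, 2 * t, math.exp(t))

def g(t):
    # The function whose derivative Rule 4 describes.
    return dot(X(t), Y(t))

t, h = 0.5, 1e-5
numeric = (g(t + h) - g(t - h)) / (2 * h)
formula = dot(dX(t), Y(t)) + dot(X(t), dY(t))
print(numeric, formula)
```

The two numbers agree up to the error of the difference quotient, which is an illustration of the rule and not a proof of it.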


Exercises 

Find the velocity vector of the following curves. 

1. (e^t, cos t, sin t) 2. (sin 2t, log(1 + t), t)

3. (cos t, sin t) 4. (cos 3t, sin 3t)

5. In Exercises 3 and 4, show that the velocity vector is perpendicular to the 
position vector. Is this also the case in Exercises 1 and 2? 

6. In Exercises 3 and 4, show that the acceleration vector is in the opposite 
direction from the position vector. 

7. Let A, B be two constant vectors. What is the velocity vector of the curve
X = A + tB?

8. Let X(t) be a differentiable curve. A plane or line which is perpendicular
to the velocity vector X'(t) at the point X(t) is said to be normal to the curve
at the point t or also at the point X(t). Find the equation of a line normal
to the curves of Exercises 3 and 4 at the point π/3.

9. Find the equation of a plane normal to the curve

(e^t, t, t²)

at the point t = 1.

10. Same question at the point t = 0. 

11. Let X(t) be a differentiable curve defined on an open interval. Let Q be 
a point which is not on the curve. 

(a) Write down the formula for the distance between Q and an arbitrary 
point on the curve. 

(b) If t_0 is a value of t such that the distance between Q and X(t_0) is at a
minimum, show that the vector Q - X(t_0) is normal to the curve, at
the point X(t_0). [Hint: Investigate the minimum of the square of the
distance.]

(c) If X(t) is the parametric equation of a straight line, show that there
exists a unique value t_0 such that the distance between Q and X(t_0)
is a minimum.

12. Assume that the differentiable curve X(t) lies on the sphere of radius 1.
Show that the velocity vector is perpendicular to the position vector. [Hint:
Start from the condition X(t)² = 1.]

13. Let A be a non-zero vector, c a number, and Q a point. Let P_0 be the
point of intersection of the line passing through Q, in the direction of A,
and the plane X · A = c. Show that for all points P of the plane, we have

||Q - P_0|| ≤ ||Q - P||.

[Hint: If P ≠ P_0, consider the straight line passing through P_0 and P, and
use Exercise 11(c).]

14. Prove that if the acceleration of a curve is always perpendicular to its
velocity, then its speed is constant.

15. Let B be a non-zero vector, and let X(t) be such that X(t) · B = t for all t.
Assume also that the angle between X'(t) and B is constant. Show that
X''(t) is perpendicular to X'(t).

16. Write a parametric equation for the tangent line to the given curve at the
given point in each of the following cases.

(a) (cos 4t, sin 4t, t) at the point t = π/8

(b) (t, 2t, t²) at the point (1, 2, 1)

(c) (e^(3t), e^(-3t), 3√2 t) at t = 1

(d) (t, t³, t⁴) at the point (1, 1, 1)

17. Let A, B be fixed non-zero vectors. Let

X(t) = e^(2t) A + e^(-2t) B.

Show that X''(t) has the same direction as X(t).

18. Show that the two curves (e^t, e^(2t), 1 - e^(-t)) and (1 - θ, cos θ, sin θ)
intersect at the point (1, 1, 0). What is the angle between their tangents at that
point?

19. At what points does the curve (2t², 1 - t, 3 + t²) intersect the plane

3x - 14y + z - 10 = 0?

20. Let X(t) be a differentiable curve and suppose that X'(t) = O for all t
throughout its interval of definition I. What can you say about the curve?
Suppose X'(t) ≠ O but X''(t) = O for all t in the interval. What can you
say about the curve?

21. Let X(t) = (a cos t, a sin t, bt), where a, b are constant. Let θ(t) be the
angle which the tangent line at a given point of the curve makes with the
z-axis. Show that cos θ(t) is the constant b/√(a² + b²).

22. Show that the velocity and acceleration vectors of the curve in Exercise 21
have constant lengths.

23. Let B be a fixed unit vector, and let X(t) be a curve such that X(t) · B = e^(2t)
for all t. Assume also that the velocity vector of the curve has a constant
angle θ with the vector B, with 0 < θ < π/2.

(a) Show that the speed is 2e^(2t)/cos θ.

(b) Determine the dot product X'(t) · X''(t) in terms of t and θ.

24. Let 


m = 


1 + t 2 1 + t 2 ) 


Show that the cosine of the angle between X(t) and X'(t) is constant. 

25. Using the definition of the cross product by coordinates given in Chapter I,
prove that if X(t) and Y(t) are two differentiable curves (defined for the
same values of t), then

d[X(t) × Y(t)]/dt = X(t) × dY(t)/dt + dX(t)/dt × Y(t).

26. Show that

d/dt [X(t) × X'(t)] = X(t) × X''(t).

27. Let Y(t) = X(t) × X'(t). Show that Y'(t) = X(t) × X''(t).

28. Let Y(t) = X(t) · (X'(t) × X''(t)). Show that Y' = X · (X' × X''').


§2. Length of curves 

We define the length of a curve X between two values a, b of t (a ≤ b)
in the interval of definition of the curve to be the integral of the speed:

∫_a^b v(t) dt = ∫_a^b ||X'(t)|| dt.

By definition, we can rewrite this integral in the form

∫_a^b √((dx_1/dt)² + ··· + (dx_n/dt)²) dt.

When n = 2, this is the same formula for the length which we gave in
an earlier course. Thus the formula in dimension n is a very natural
generalization of the formula in dimension 2. Namely, when

X(t) = (x(t), y(t))

is given by two coordinates, then the length of the curve between a and b
is equal to

∫_a^b √((dx/dt)² + (dy/dt)²) dt.

Example. Let the curve be defined by

X(t) = (sin t, cos t).

Then X'(t) = (cos t, -sin t) and v(t) = √(cos² t + sin² t) = 1. Hence
the length of the curve between t = 0 and t = 1 is

∫_0^1 v(t) dt = t |_0^1 = 1.


In this case, of course, the integral is easy to evaluate. There is no reason 
why this should always be the case. 

Example. Set up the integral for the length of the curve

$$X(t) = (e^t, \sin t, t)$$

between t = 1 and $t = \pi$.

We have $X'(t) = (e^t, \cos t, 1)$. Hence the desired integral is

$$\int_1^\pi \sqrt{e^{2t} + \cos^2 t + 1}\,dt.$$


In this case, there is no easy formula for the integral. In the exercises, 
however, the functions are adjusted in such a way that the integral can 
be evaluated by elementary techniques of integration. Don’t expect this 
to be the case in real life, though. 
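
When no elementary antiderivative is available, the length can still be approximated numerically. The following is a minimal sketch, not part of the text: the helper name `arc_length` and the choice of the midpoint rule are our own assumptions. It is applied to the two examples above.

```python
import math

def arc_length(velocity, a, b, n=10_000):
    """Approximate the integral of ||X'(t)|| over [a, b] by the midpoint rule.

    `velocity` is a function t -> X'(t) returning a tuple of coordinates.
    (Hypothetical helper, introduced only for illustration.)
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += math.sqrt(sum(c * c for c in velocity(t)))
    return total * h

# X(t) = (sin t, cos t): the speed is 1, so the length on [0, 1] should be 1.
circle_velocity = lambda t: (math.cos(t), -math.sin(t))

# X(t) = (e^t, sin t, t): no easy formula, so we integrate numerically.
second_velocity = lambda t: (math.exp(t), math.cos(t), 1.0)

length_circle = arc_length(circle_velocity, 0.0, 1.0)
length_second = arc_length(second_velocity, 1.0, math.pi)
print(length_circle, length_second)
```

The second value is only an approximation; refining `n` gives more digits, but no closed form is implied.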


Exercises 

1. Find the length of the spiral (cos t, sin t, t ) between t = 0 and t = 1. 

2. Find the length of the spiral (cos 2t, sin 2t, 3t) between t = 1 and t = 3.

3. Find the length of the indicated curve for the given interval:

(a) (cos 4t, sin 4t, t) between t = 0 and $t = \pi/8$.

(b) $(t, 2t, t^2)$ between t = 1 and t = 3.

(c) $(e^{3t}, e^{-3t}, 3\sqrt{2}\,t)$ between t = 0 and t =

4. Find the length of the curve defined by

$$X(t) = (t - \sin t, 1 - \cos t)$$

between (a) t = 0 and $t = 2\pi$, (b) t = 0 and $t = \pi/2$.

5. Find the length of the curve X(t) = (t, log t) between (a) t = 1 and t = 2,
(b) t = 3 and t = 5.

[Hint: Substitute $u^2 = 1 + t^2$ to evaluate the integral.]

6. Find the length of the curve defined by X(t) = (t, log cos t) between t = 0
and $t = \pi/4$.


§3. The chain rule and applications 

This section may be omitted if the course is pressed for time or other topics. 

Let X be a vector and c a number. As a matter of notation it will be
convenient to define Xc to be cX; in other words, we allow ourselves to
multiply vectors by numbers on the right. If we have a curve X(t) defined
for some interval, and a function g(t) defined on the same interval, then
we let

$$X(t)g(t) = g(t)X(t).$$

Let X = X(t) be a differentiable curve. Let f be a function defined on
some interval, such that the values of f lie in the domain of definition of
the curve X(t). Then we may form the composite curve $X \circ f$. If s is a
number at which f is defined, we let the value of $X \circ f$ at s be

$$(X \circ f)(s) = X(f(s)).$$

For example, let $X(t) = (t^2, e^t)$ and let $f(s) = \sin s$. Then

$$X(f(s)) = (\sin^2 s, e^{\sin s}).$$

Each component of X(f(s)) becomes a function of s, just as when we
studied the chain rule for functions.

Chain Rule. If X is a differentiable curve and f is a differentiable function
defined on some interval, whose values are contained in the interval
of definition of the curve, then the composite curve $X \circ f$ is differentiable,
and

$$(X \circ f)'(s) = X'(f(s))\,f'(s).$$

The expression on the right can also be written $f'(s)X'(f(s))$. It is the
product of the function f' times the vector X'.

In another notation, if we let t = f(s), then we can write the above
formula in the form

$$\frac{d(X \circ f)}{ds} = \frac{dX}{dt}\,\frac{dt}{ds}.$$

The proof of the chain rule is trivial, using the chain rule for functions.
Indeed, let Y(s) = X(f(s)). Then

$$Y(s) = (x_1(f(s)), \ldots, x_n(f(s))).$$

Taking the derivative term by term, we find:

$$Y'(s) = (x_1'(f(s))f'(s), \ldots, x_n'(f(s))f'(s)).$$

We can take f'(s) outside the vector, and get

$$Y'(s) = X'(f(s))\,f'(s),$$

which is precisely what we want.
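
As a sanity check of the chain rule on the example $X(t) = (t^2, e^t)$, $f(s) = \sin s$, one may compare a finite-difference derivative of the composite curve with $X'(f(s))f'(s)$. This is only a numerical sketch; the function names and the sample value s = 0.7 are our own choices.

```python
import math

def X(t):
    """The curve X(t) = (t^2, e^t)."""
    return (t * t, math.exp(t))

def X_prime(t):
    """Its velocity X'(t) = (2t, e^t), computed by hand."""
    return (2 * t, math.exp(t))

f = math.sin          # the change of parameter t = f(s)
f_prime = math.cos    # its derivative

def composite_derivative(s, h=1e-6):
    """Central-difference estimate of (X o f)'(s)."""
    p, q = X(f(s + h)), X(f(s - h))
    return tuple((a - b) / (2 * h) for a, b in zip(p, q))

s = 0.7
numeric = composite_derivative(s)
exact = tuple(c * f_prime(s) for c in X_prime(f(s)))   # X'(f(s)) f'(s)
print(numeric, exact)
```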

The change of variables from t to s is also called a change of parametri- 
zation of the curve. Under certain changes of parametrization, certain 
formulas involving the velocity and acceleration of the curve become 
simpler and reflect geometric properties more clearly. We shall see 
examples of this in a moment. 

Let us now assume that all the functions with which we dealt above
have second derivatives. Using the chain rule, and the rule for the derivative
of a product, we obtain the following two formulas:

(1) $Y'(s) = f'(s)X'(f(s))$,

(2) $Y''(s) = f''(s)X'(f(s)) + (f'(s))^2 X''(f(s))$.

We shall consider an important special case of these formulas. 

We have defined

$$v(t) = \|X'(t)\|$$

to be the speed. Let us now assume that each coordinate function of X'(t)
is continuous. In that case, we say that X'(t) is continuous. Then v(t) is
a continuous function of t. We shall assume throughout that $v(t) \ne 0$
for any value of t in the interval of definition of our curve. Then v(t) > 0
for all such values of t. We let

$$s(t) = \int v(t)\,dt$$

be a fixed indefinite integral of v(t) over our interval. (For instance, if
a is a point of the interval, we could let

$$s(t) = \int_a^t v(u)\,du.$$

We know that any two indefinite integrals of v over the interval differ
by a constant.) Then


$$\frac{ds}{dt} = v(t) > 0$$

for all values of t, and hence s is a strictly increasing function. Consequently,
the inverse function exists. Call it

$$t = f(s).$$

We can then write

$$X(t) = X(f(s)) = Y(s).$$

Thus we are in the situation described above. 


The velocity vectors of the curve depending on the two different
parametrizations are related as in formula (1). From the theory of derivatives
of inverse functions, we know that

$$f'(s) = \frac{dt}{ds} = \left(\frac{ds}{dt}\right)^{-1} = \frac{1}{v(t)}.$$

Hence f'(s) is always positive. This means that in the present case, Y'(s)
and X'(t) have the same direction when t = f(s).

A curve $Y\colon J \to \mathbf{R}^n$ is said to be parametrized by arc length if $\|Y'(s)\| = 1$
for all s in the interval of definition J. The reason for this is contained in
the next theorem.

Theorem 1. Let $X\colon I \to \mathbf{R}^n$ be a curve whose speed v(t) is > 0 for all
t in the interval of definition. Let

$$s(t) = \int_a^t v(u)\,du$$

and t = f(s) be the inverse function. Then the curve given by

$$s \mapsto Y(s) = X(f(s))$$

is parametrized by arc length, and Y'(s) is perpendicular to Y''(s) for
each value of s.

Proof. From formula (1), we get

$$\|Y'(s)\| = |f'(s)|\,\|X'(t)\| = \frac{dt}{ds}\,\frac{ds}{dt}.$$

By what we just saw above, this last expression is equal to 1. Thus Y'(s)
is a vector of length 1, a unit vector, in the same direction as X'(t). Thus
the velocity vector of the curve Y has constant length.

In particular, we have $Y'(s)^2 = 1$. Differentiating with respect to s,
we get $2Y' \cdot Y'' = 0$.

Hence Y'(s) is perpendicular to Y''(s) for each value of s. This proves
the theorem.
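
To see Theorem 1 at work numerically, consider the spiral X(t) = (cos t, sin t, t), whose speed is the constant $\sqrt{2}$, so that $s = \sqrt{2}\,t$ and $t = f(s) = s/\sqrt{2}$ (taking a = 0). The sketch below uses finite differences to check that $\|Y'(s)\| = 1$ and $Y'(s) \cdot Y''(s) = 0$. The step sizes and the sample value s = 1.3 are arbitrary assumptions made for illustration.

```python
import math

ROOT2 = math.sqrt(2.0)

def Y(s):
    """The spiral X(t) = (cos t, sin t, t) reparametrized with t = s / sqrt(2)."""
    t = s / ROOT2
    return (math.cos(t), math.sin(t), t)

def first_derivative(curve, s, h=1e-5):
    """Central-difference estimate of curve'(s)."""
    return tuple((a - b) / (2 * h) for a, b in zip(curve(s + h), curve(s - h)))

def second_derivative(curve, s, h=1e-4):
    """Central-difference estimate of curve''(s)."""
    return tuple((a - 2 * b + c) / (h * h)
                 for a, b, c in zip(curve(s + h), curve(s), curve(s - h)))

s = 1.3
Y1 = first_derivative(Y, s)
Y2 = second_derivative(Y, s)
speed = math.sqrt(sum(c * c for c in Y1))   # should be close to 1
dot = sum(a * b for a, b in zip(Y1, Y2))    # should be close to 0
print(speed, dot)
```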


From (2), we see that the acceleration Y''(s) has two components.
First, a tangential component

$$f''(s)X'(t)$$

parallel to X'(t), which involves the naive notion of scalar acceleration,
namely the second derivative f''(s). Second, another component in
the direction of X''(t), with a coefficient

$$(f'(s))^2$$

which is positive. [We assume of course that $X''(t) \ne O$.]

For a given value of t, let us assume that $X'(t) \ne O$ and $X''(t) \ne O$, and
also that X'(t) and X''(t) do not lie on the same straight line. Then the
plane passing through X(t), parallel to X'(t) and X''(t), is called the osculating
plane of the curve at time t, or also at the point X(t). [Actually, it is
more accurate to say at time t, because there may be two numbers $t_1$, $t_2$ in
the interval of definition of the curve such that $X(t_1) = X(t_2)$.]

Example 1. Let X(t) = (sin t, cos t, t). Find the osculating plane to
this curve at $t = \pi/2$.

We have

$$X'(\pi/2) = (0, -1, 1)$$

and

$$X''(\pi/2) = (-1, 0, 0).$$

We find first a vector perpendicular to $X'(\pi/2)$ and $X''(\pi/2)$. For instance,
N = (0, 1, 1) is such a vector. Furthermore, let $P = X(\pi/2) = (1, 0, \pi/2)$.
Then the osculating plane at $t = \pi/2$ is the plane passing through P,
perpendicular to N, and its equation is therefore

$$y + z = \pi/2.$$
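
One systematic way to find a vector perpendicular to both X'(t) and X''(t) is the cross product from Chapter I. The sketch below redoes Example 1 this way. The helper names `cross` and `dot` are ours, and the normal obtained is the negative of the N chosen above, which describes the same plane.

```python
import math

def cross(a, b):
    """Cross product of two vectors in 3-space, by coordinates."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

t = math.pi / 2
P = (math.sin(t), math.cos(t), t)      # X(t)
V = (math.cos(t), -math.sin(t), 1.0)   # X'(t)
A = (-math.sin(t), -math.cos(t), 0.0)  # X''(t)

N = cross(V, A)   # perpendicular to both X'(t) and X''(t)
d = dot(N, P)     # the plane is N . (x, y, z) = d
print(N, d)
```

Up to rounding, N is (0, -1, -1) and d is $-\pi/2$, so the plane is $-y - z = -\pi/2$, which is the equation $y + z = \pi/2$ found above.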

In case of parametrization by arc length, or in fact in any other
parametrization such that $f'(s) \ne 0$, we see from formulas (1) and (2)
that the plane parallel to X'(t) and X''(t) is the same as the plane parallel
to Y'(s) and Y''(s) because from these formulas, we can solve back for
X'(t) and X''(t) in terms of this other pair of vectors. Thus the osculating
plane does not depend on a change of parametrization t = f(s) such that
$f'(s) \ne 0$.

Let us assume that a curve is parametrized by arc length. Thus we write
the curve as Y(s), and by Theorem 1, we have $\|Y'(s)\| = 1$ and

$$Y'(s) \cdot Y''(s) = 0.$$

Then Y'(s) and Y''(s) look like this:


Figure 4 



Example 2. Let R be a number > 0. A parametrization for the circle
of radius R by arc length is given by

$$Y(s) = \left(R\cos\frac{s}{R},\ R\sin\frac{s}{R}\right),$$

as one sees immediately, because $\|Y'(s)\| = 1$.
Differentiating twice shows that

$$Y''(s) = -\frac{1}{R}\left(\cos\frac{s}{R},\ \sin\frac{s}{R}\right)$$

and hence that

$$\|Y''(s)\| = \frac{1}{R}$$

or

$$R = \frac{1}{\|Y''(s)\|}.$$



For an arbitrary curve Y parametrized by arc length, it is customary
to make a definition which is motivated by the geometry of the special
example just discussed, namely we define the radius of curvature R(s) to be

$$R(s) = \frac{1}{\|Y''(s)\|}$$

at all points such that $\|Y''(s)\| \ne 0$. (Note that if $Y''(s) = O$ on some
interval, then Y(s) = As + B for suitable vectors A, B, and thus Y parametrizes
a straight line. Thus intuitively, it is reasonable to view its radius
of curvature as infinity.)

The same motivation as above leads us to define the curvature itself
to be $\|Y''(s)\|$. The curvature is usually denoted by k.

Most curves are not given parametrized by arc length, and thus
it is useful to have a formula which gives the curvature in terms of the
given parameter t. This comes immediately from the chain rule. Indeed,
keeping our notation X(t) and Y(s) with ds/dt = v(t), we have the
formula

$$Y''(s) = \frac{1}{v(t)}\,\frac{d}{dt}\left(\frac{1}{v(t)}\,X'(t)\right)$$

where $v(t) = \|X'(t)\|$ is the length of the velocity vector X'(t).

Proof. From formula (1), we know that

$$Y'(s) = \frac{dt}{ds}\,X'(t) = \frac{1}{v(t)}\,X'(t).$$

By the chain rule,

$$Y''(s) = \frac{d\big(Y'(s(t))\big)}{dt}\,\frac{dt}{ds},$$

which yields precisely the formula in the box.

The curvature is then equal to the length of the vector in the box, that is:

$$k = \left\| \frac{1}{v(t)}\,\frac{d}{dt}\left(\frac{1}{v(t)}\,X'(t)\right) \right\|.$$



Example 3. Find the curvature of the curve given by

$$X(t) = (\cos t, \sin t, t).$$

We have $X'(t) = (-\sin t, \cos t, 1)$ and $v(t) = \sqrt{2}$ is constant. Then
$X''(t) = (-\cos t, -\sin t, 0)$, and from the formula for the curvature we
find

$$k(t) = \left\| \frac{1}{\sqrt{2}}\,\frac{1}{\sqrt{2}}\,X''(t) \right\| = \frac{1}{2}.$$

We see in particular that the curve has constant curvature. 
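
The curvature formula lends itself to a direct numerical check. The following sketch takes the velocity X'(t) as given, differentiates the unit vector $X'(t)/v(t)$ by a central difference, and divides by v(t). The function names, the step size, and the sample value t = 0.4 are assumptions made for illustration only.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def unit_tangent(velocity, t):
    """The vector (1 / v(t)) X'(t)."""
    V = velocity(t)
    v = norm(V)
    return tuple(c / v for c in V)

def curvature(velocity, t, h=1e-5):
    """Length of (1/v) d/dt((1/v) X'), with d/dt taken by central differences."""
    Tp = unit_tangent(velocity, t + h)
    Tm = unit_tangent(velocity, t - h)
    dT = tuple((a - b) / (2 * h) for a, b in zip(Tp, Tm))
    return norm(dT) / norm(velocity(t))

# Velocity of X(t) = (cos t, sin t, t) from Example 3.
helix_velocity = lambda t: (-math.sin(t), math.cos(t), 1.0)

k = curvature(helix_velocity, 0.4)
print(k)
```

For the curve of Example 3 the result agrees with the value 1/2 found above, up to discretization error.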


Exercises 

1. Find the equations of the osculating planes for each of the following curves
at the given point.

(a) (cos 4t, sin 4t, t) at the point $t = \pi/8$

(b) $(t, 2t, t^2)$ at the point (1, 2, 1)

(c) $(e^{3t}, e^{-3t}, 3\sqrt{2}\,t)$ at t = 1

(d) $(t, t^3, t^4)$ at the point (1, 1, 1)

2. Prove formula (2) from formula (1) in detail.

3. Let r be a fixed number > 0, let c > 0, and let

$$X(t) = (r\cos t, r\sin t, ct).$$

Find the curvature as a function of t.

4. Find the curvature of the curve

$$X(t) = (t, t^2, t^3)$$

at (a) t = 1, (b) t = 0, (c) t = -1.

5. Let the plane curve be defined by X(t) = (x(t), y(t)). Show that the
curvature is given by

$$k(t) = \frac{|x'(t)y''(t) - x''(t)y'(t)|}{\left(x'(t)^2 + y'(t)^2\right)^{3/2}}.$$

6. If a curve is parametrized by x = t, y = f(t) (the natural parametrization
arising from a function y = f(x)), find a simplification for the curvature
given in the preceding exercise.

7. Find the radius of curvature of the curve X(t) = (t, log t). For which t is
the radius of curvature a minimum?

8. Find the curvatures of the curves

(a) X(t) = (t, sin t),

(b) X(t) = (sin 3t, cos 3t),

(c) X(t) = (sin 3t, cos 3t, t).

9. Find the radius of curvature of the parabola $y = x^2$.

10. Find the radius of curvature of the ellipse given by

$$X(t) = (a\cos t, b\sin t),$$

where a, b are constants.

11. Find the curvature of the curve defined by 


x(0 = 

y(0 = 



12. Find the curvature of the curve defined by 

■* t 

I 

x(0 


f cos u 

= / — —du , 
0 >fu 


y(0 = 



du 


in terms of the arc length s. 

13. Show that the curvature of the curve defined by

$$X(t) = (e^t, e^{-t}, \sqrt{2}\,t)$$

is equal to $\sqrt{2}/(e^t + e^{-t})^2$.

14. If a curve has constant velocity and acceleration, show that the curvature 
is constant. Express the curvature in terms of the lengths of the velocity 
and acceleration vectors. 




CHAPTER III 


Functions of Several Variables 


We view functions of several variables as functions of points in space. 
This appeals to our geometric intuition, and also relates such functions 
more easily with the theory of vectors. The gradient will appear as a 
natural generalization of the derivative. In this chapter we are mainly 
concerned with basic definitions and notions. We postpone the important 
theorems to the next chapter. 


§1. Graphs and level curves 

In order to conform with usual terminology, and for the sake of brevity, 
a collection of objects will simply be called a set. In this chapter, we are 
mostly concerned with sets of points in space. 

Let S be a set of points in n-space. A function (defined on S) is an asso¬ 
ciation which to each element of S associates a number. 

In practice, we sometimes omit mentioning explicitly the set S, since 
the context usually makes it clear for which points the function is defined. 


Example 1. In 2-space (the plane) we can define a function f by the
rule

$$f(x, y) = x^2 + y^2.$$

It is defined for all points (x, y) and can be interpreted geometrically as
the square of the distance between the origin and the point.

Example 2. Again in 2-space, let

$$f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}$$

be defined for all $(x, y) \ne (0, 0)$. We do not define f at (0, 0) (also written O).

Example 3. In 3-space, we can define a function f by the rule

$$f(x, y, z) = x^2 - \sin(xyz) + yz^3.$$

Since a point and a vector are represented by the same thing (namely
an n-tuple), we can think of a function such as the above also as a function
of vectors. When we do not want to write the coordinates, we write f(X)
instead of $f(x_1, \ldots, x_n)$. As with numbers, we call f(X) the value of f at
the point (or vector) X.

Just as with functions of one variable, one can define the graph of a
function f of n variables $x_1, \ldots, x_n$ to be the set of points in (n + 1)-space
of the form

$$(x_1, \ldots, x_n, f(x_1, \ldots, x_n)),$$

the $(x_1, \ldots, x_n)$ being in the domain of definition of f. Thus when
n = 1, the graph of a function f is a set of points (x, f(x)). When n = 2,
the graph of a function f is the set of points (x, y, f(x, y)). When n = 2, it
is already difficult to draw the graph since it involves a figure in 3-space.
The graph of a function of two variables may look like this:



Figure 1 


When we get to the graph of a function of three variables, it is of course 
impossible to draw it, since it exists in 4-space. However, we shall describe 
another means of visualizing the function. 

For each number c, the equation f(x, y) = c is the equation of a curve
in the plane. We have considerable experience in drawing the graphs of
such curves, and we may therefore assume that we know how to draw
this graph in principle. This curve is called the level curve of f at c. It
gives us the set of points (x, y) where f takes on the value c. By drawing
a number of such level curves, we can get a good description of the
function.


Example 1 (continued). The level curves are described by equations

$$x^2 + y^2 = c.$$

These have a solution only when $c \ge 0$. In that case, they are circles
(unless c = 0 in which case the circle of radius 0 is simply the origin).
In Fig. 2, we have drawn the level curves for c = 1 and 4.

Figure 2

The graph of the function $z = f(x, y) = x^2 + y^2$ is then a figure in
3-space, which we may represent as follows.



Example 2 (continued). To find the level curves in Example 2, we have
to determine the values (x, y) such that

$$x^2 - y^2 = c(x^2 + y^2)$$

for a given number c. This amounts to solving $x^2(1 - c) = y^2(1 + c)$.

If x = 0, then f(0, y) = -1. Thus on the y-axis our function has the
constant value -1. If $x \ne 0$, then we can divide by $x^2$ in the above equality,
and we obtain (for $c \ne -1$)

$$\frac{y^2}{x^2} = \frac{1 - c}{1 + c}.$$

Taking the square root, we obtain two level lines, namely

$$y = ax \quad\text{and}\quad y = -ax, \qquad\text{where } a = \sqrt{\frac{1 - c}{1 + c}}.$$

Thus the level curves are straight lines (excluding the origin). We have
drawn some of them in Fig. 4. (The numbers indicate the value of the
function on the corresponding line.)
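
The computation of the level lines in Example 2 can be checked numerically: for a given c strictly between -1 and 1, the function should take the value c at every point of the lines y = ax and y = -ax other than the origin. The sketch below does this for c = 0.6, where a = 0.5. The sample points are arbitrary choices of ours.

```python
import math

def f(x, y):
    """The function of Example 2, defined for (x, y) != (0, 0)."""
    return (x * x - y * y) / (x * x + y * y)

def slope(c):
    """The number a = sqrt((1 - c) / (1 + c)), for -1 < c <= 1."""
    return math.sqrt((1 - c) / (1 + c))

c = 0.6
a = slope(c)
sample_x = (-2.0, 0.5, 3.0)   # arbitrary nonzero abscissas
values = [f(x, a * x) for x in sample_x] + [f(x, -a * x) for x in sample_x]
print(a, values)
```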

It would of course be technically much more disagreeable to draw the 
level lines in Example 3, and we shall not do so. 

Figure 4

We see that the level lines are based on the same principle as the contour 
lines of a map. Each line describes, so to speak, the altitude of the func¬ 
tion. If the graph is interpreted as a mountainous region, then each level 
curve gives the set of points of constant altitude. In Example 1, a person 
wanting to stay at a given altitude need but walk around in circles. In 
Example 2, such a person should walk on a straight line towards or away 
from the origin. 

If we deal with a function of three variables, say f(x, y, z), then
(x, y, z) = X is a point in 3-space. In that case, the set of points satisfying
the equation

$$f(x, y, z) = c$$

for some constant c is a surface. The notion analogous to that of level
curve is that of level surface.

In physics, a function f might be a potential function, giving the value
of the potential energy at each point of space. The level surfaces are then
sometimes called surfaces of equipotential. The function f might also give
a temperature distribution (i.e. its value at a point X is the temperature
at X). In that case, the level surfaces are called isothermal surfaces.

Exercises 



Sketch the level curves for the functions z = f(x, y), where f(x, y) is given
by the following expressions.

1. $x^2 + 2y^2$    2. $y - x^2$    3. $y - 3x^2$

4. $x - y^2$    5. $3x^2 + 3y^2$    6. $xy$

7. $(x - 1)(y - 2)$

8. $(x + 1)(y + 3)$


10. 2x - 3 y 
2 

ix wi r 

13. 

14. 


4xy(x 


X 2 _|_ _y2 

x + T 


x - y 


11. 


xy 


2 \ 
y ) 


* 2 + p 2 


(try polar coordinates) 


15. 


2 . 2 
X + T 

x 2 — y 2 


2 2 

9 - + ^- 
4 + 16 


12 . 


xy 


x 2 + y* 


(In Exercises 11, 12, and 13, the function is not defined at (0,0). In Exercise 14, 
it is not defined for y = x, and in Exercise 15 it is not defined for y = x or 
y = -x.) 

16. $(x - 1)^2 + (y + 3)^2$    17. $x^2 - y^2$


§2. Partial derivatives

In this section and the next, we discuss the notion of differentiability
for functions of several variables. When we discussed the derivative of 
functions of one variable, we assumed that such a function was defined 
on an interval. We shall have to make a similar assumption in the case 
of several variables, and for this we need to introduce a new notion. 

Let P be a point in n-space, and let a be a number > 0. The set of
points X such that

$$\|X - P\| < a$$

will be called the open ball of radius a and center P. The set of points X
such that

$$\|X - P\| \le a$$

will be called the closed ball of radius a and center P. The set of points X
such that

$$\|X - P\| = a$$

will be called the sphere of radius a and center P.

Thus when n = 1, we are in 1-space, and the open ball of radius a is 
the open interval centered at P. The sphere of radius a and center P 
consists only of two points. 

When n = 2, the open ball of radius a and center P is also called the 
open disc. The sphere is the circle. 

When n = 3, then our terminology coincides with the obvious inter¬ 
pretation we might want to place on the words. 

The following are the pictures of the spheres of radius 1 in 2-space and 
3-space respectively centered at the origin. 

Figure 5


Let $S_1$ be the sphere of radius 1, centered at the origin. Let a be a
number > 0. If X is a point of the sphere $S_1$, then aX is a point of the
sphere of radius a, because

$$\|aX\| = a\|X\| = a.$$

In this manner, we get all points of the sphere of radius a. (Proof?) Thus
the sphere of radius a is obtained by stretching the sphere of radius 1,
through multiplication by a.

A similar remark applies to the open and closed balls of radius a, they
being obtained from the open and closed balls of radius 1 through multiplication
by a. (Prove this as an exercise.)

Let U be a set of points in n-space. We shall say that U is an open set
in n-space if the following condition is satisfied: Given any point P in U,
there exists an open ball B of radius a > 0 which is centered at P and
such that B is contained in U.


Example 1. In the plane, the set consisting of the first quadrant,
excluding the x- and y-axes, is an open set.

The x-axis is not open in the plane (i.e. in 2-space). Given a point on
the x-axis, we cannot find an open disc centered at the point and contained
in the x-axis.

On the other hand, if we view the x-axis as the set of points in 1-space,
then it is open in 1-space. Similarly, the interval

$$-1 < x < 1$$

is open in 1-space, but not open in 2-space, or n-space for n > 1.


Example 2. Let U be the open ball of radius a > 0 centered at the
origin. Then U is an open set. To prove this, let P be a point of this ball,
so $\|P\| < a$. Say $\|P\| = b$. Let c = a - b. If X is a point such that
$\|X - P\| < c$, then

$$\|X\| \le \|X - P\| + \|P\| < a - b + b = a.$$

Hence the open ball of radius c centered at P is contained in U. Hence
U is open.

In the next picture we have drawn an open set in the plane, consisting
of the region inside the curve, but not containing any point of the boundary.
We have also drawn a point P in U, and a sphere (disc) around P
contained in U.



Figure 6 


When we defined the derivative as a limit of

$$\frac{f(x + h) - f(x)}{h},$$

we needed the function f to be defined in some open interval around the
point x.

Now let f be a function of n variables, defined on an open set U. Then
for any point X in U, the function f is also defined at all points which are
close to X, namely all points which are contained in an open ball centered
at X and contained in U.

For small values of h, the point

$$(x_1 + h, x_2, \ldots, x_n)$$

is contained in such an open ball. Hence the function is defined at that
point, and we may form the quotient

$$\frac{f(x_1 + h, x_2, \ldots, x_n) - f(x_1, \ldots, x_n)}{h}.$$

If the limit exists as h tends to 0, then we call it the first partial derivative
of f and denote it by $D_1 f(x_1, \ldots, x_n)$, or $D_1 f(X)$, or also by

$$\frac{\partial f}{\partial x_1}.$$

Similarly, we let

$$D_i f(X) = \frac{\partial f}{\partial x_i} = \lim_{h \to 0} \frac{f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_n)}{h}$$

if it exists, and call it the i-th partial derivative.

When n = 2 and we work with variables (x, y), then the first and second
partials are also denoted

$$\frac{\partial f}{\partial x} \quad\text{and}\quad \frac{\partial f}{\partial y}.$$

By definition, we therefore have

$$\frac{\partial f}{\partial x} = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}$$

and

$$\frac{\partial f}{\partial y} = \lim_{k \to 0} \frac{f(x, y + k) - f(x, y)}{k}.$$

A partial derivative is therefore obtained by keeping all but one variable
fixed, and taking the ordinary derivative with respect to this one variable.

Example 3. Let $f(x, y) = x^2y^3$. Then

$$\frac{\partial f}{\partial x} = 2xy^3 \quad\text{and}\quad \frac{\partial f}{\partial y} = 3x^2y^2.$$
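
The definition suggests a simple numerical approximation of a partial derivative: vary one coordinate and keep the others fixed. The sketch below applies this to the function of Example 3 and compares with the formulas $2xy^3$ and $3x^2y^2$. The helper name `partial`, the step size, and the sample point (1.5, 2) are our own assumptions.

```python
def partial(func, point, i, h=1e-6):
    """Central-difference estimate of the i-th partial derivative (i = 0, 1, ...).

    Only the i-th coordinate of `point` is varied; the others stay fixed.
    """
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def f(x, y):
    return x**2 * y**3

point = (1.5, 2.0)
df_dx = partial(f, point, 0)   # compare with 2 x y^3 = 24
df_dy = partial(f, point, 1)   # compare with 3 x^2 y^2 = 27
print(df_dx, df_dy)
```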

We observe that the partial derivatives are themselves functions. This
is the reason why the notation $D_i f$ is sometimes more useful than the
notation $\partial f/\partial x_i$. It allows us to write $D_i f(P)$ for any point P in the set
where the partial is defined. There cannot be any ambiguity or confusion
with a (meaningless) symbol $D_i(f(P))$, since f(P) is a number. Thus
$D_i f(P)$ means $(D_i f)(P)$. It is the value of the function $D_i f$ at P.

Example 4. Let f(x, y) = sin xy. To find $D_2 f(1, \pi)$, we first find
$\partial f/\partial y$, or $D_2 f(x, y)$, which is simply

$$D_2 f(x, y) = (\cos xy)x.$$

Hence

$$D_2 f(1, \pi) = (\cos \pi) \cdot 1 = -1.$$

Also, 

»•/(>' 

Let f be defined in an open set U and assume that the partial derivatives
of f exist at each point X of U. The vector

$$\left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right) = (D_1 f(X), \ldots, D_n f(X)),$$

whose components are the partial derivatives, will be called the gradient
of f at X and will be denoted by grad f(X). One must read this

$$(\operatorname{grad} f)(X),$$

but we shall usually omit the parentheses around grad f. Sometimes one
also writes $\nabla f$ instead of grad f.





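If f is a function of two variables (x, y), then we have

$$\nabla f(x, y) = \operatorname{grad} f(x, y) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right).$$


Example 5. Let $f(x, y) = x^2y^3$. Then

$$\operatorname{grad} f(x, y) = (2xy^3, 3x^2y^2),$$

so that in this case,

$$\operatorname{grad} f(1, 2) = (16, 12).$$

Thus the gradient of a function f associates a vector to a point X.
If f is a function of three variables (x, y, z), then

$$\operatorname{grad} f(x, y, z) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right).$$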

Using the formula for the derivative of a sum of two functions, and 
the derivative of a constant times a function, we conclude at once that the 
gradient satisfies the following properties: 

Theorem 1. Let f, g be two functions defined on an open set U, and
assume that their partial derivatives exist at every point of U. Let c be a
number. Then

$$\operatorname{grad}(f + g) = \operatorname{grad} f + \operatorname{grad} g,$$
$$\operatorname{grad}(cf) = c \operatorname{grad} f.$$

You should carry out the details of the proof as an exercise. 
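
Before writing out the proof, it may help to test the two properties on a concrete case. The sketch below approximates gradients by central differences, taking $f(x, y) = x^2y^3$ from Example 5 together with g(x, y) = sin xy and c = 3. The choice of g, c, and the point (1, 2) is ours, and a numerical check of this kind is of course not a proof.

```python
import math

def grad(func, x, y, h=1e-6):
    """Central-difference estimate of the gradient of a function of two variables."""
    return ((func(x + h, y) - func(x - h, y)) / (2 * h),
            (func(x, y + h) - func(x, y - h)) / (2 * h))

f = lambda x, y: x**2 * y**3
g = lambda x, y: math.sin(x * y)
c = 3.0

point = (1.0, 2.0)
grad_f = grad(f, *point)    # Example 5 gives (16, 12)
grad_g = grad(g, *point)
grad_sum = grad(lambda x, y: f(x, y) + g(x, y), *point)
grad_scaled = grad(lambda x, y: c * f(x, y), *point)
print(grad_f, grad_g, grad_sum, grad_scaled)
```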

We shall give later several geometric and physical interpretations for 
the gradient. 


Exercises 


Find the partial derivatives

$$\frac{\partial f}{\partial x}, \quad \frac{\partial f}{\partial y}, \quad\text{and}\quad \frac{\partial f}{\partial z},$$

for the following functions f(x, y) or f(x, y, z).

1. $xy + z$    2. $x^2y^5 + 1$    3. $\sin(xy) + \cos z$

4. $\cos(xy)$    5. $\sin(xyz)$    6. $e^{xyz}$

7. $x^2 \sin(yz)$    8. $xyz$    9. $xz + yz + xy$

10. $x \cos(y - 3z) + \arcsin(xy)$

11. Find grad f(P) if P is the point (1, 2, 3) in Exercises 1, 2, 6, 8, and 9.

12. Find grad f(P) if P is the point $(1, \pi, \pi)$ in Exercises 4, 5, 7.

13. Find grad f(P) if

$$f(x, y, z) = \log(z + \sin(y^2 - x))$$

and

$$P = (1, -1, 1).$$

14. Find the partial derivatives of $x^y$.

Find the gradient of the following functions at the given point.

15. $f(x, y, z) = e^{-2x} \cos(yz)$ at $(1, \pi, \pi)$

16. $f(x, y, z) = e^{3x+y} \sin(5z)$ at $(0, 0, \pi/6)$

17. Prove that an open ball of radius a > 0 centered at some point Q is in 
fact an open set. 


§3. Differentiability and gradient

Let f be a function defined on an open set U. Let X be a point of U.
For all vectors H such that $\|H\|$ is small (and $H \ne O$), the point X + H
also lies in the open set. However we cannot form a quotient

$$\frac{f(X + H) - f(X)}{H}$$

because it is meaningless to divide by a vector. In order to define what
we mean for a function f to be differentiable, we must therefore find a way
which does not involve dividing by H.

We reconsider the case of functions of one variable. Let us fix a number x.
We had defined the derivative to be

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$

Let

$$g(h) = \frac{f(x + h) - f(x)}{h} - f'(x).$$

Then g(h) is not defined when h = 0, but

$$\lim_{h \to 0} g(h) = 0.$$

We can write

$$f(x + h) - f(x) = f'(x)h + hg(h).$$

This relation has meaning so far only when $h \ne 0$. However, we observe
that if we define g(0) to be 0, then the preceding relation is obviously
true when h = 0 (because we just get 0 = 0).

Furthermore, we can replace h by -h if we replace g by -g. Thus we
have shown that if f is differentiable, there exists a function g such that

(1) $f(x + h) - f(x) = f'(x)h + |h|g(h)$, with $\lim_{h \to 0} g(h) = 0$.
(1) 

Conversely, suppose that there exists a number a and a function g(h)
such that

(1a) $f(x + h) - f(x) = ah + |h|g(h)$, with $\lim_{h \to 0} g(h) = 0$.

We find for $h \ne 0$,

$$\frac{f(x + h) - f(x)}{h} = a + \frac{|h|}{h}\,g(h).$$

Taking the limit as h approaches 0, we observe that

$$\lim_{h \to 0} \frac{|h|}{h}\,g(h) = 0.$$

Hence the limit of the Newton quotient exists and is equal to a. Hence
f is differentiable, and its derivative f'(x) is equal to a.

Therefore, the existence of a number a and a function g satisfying (la) 
above could have been used as the definition of differentiability in the case 
of functions of one variable. The great advantage of (1) is that no h 
appears in the denominator. It is this relation which will suggest to us 
how to define differentiability for functions of several variables, and how 
to prove the chain rule for them. 

We now consider a function of n variables. 

Let f be a function defined on an open set U. Let X be a point of U.
If $H = (h_1, \ldots, h_n)$ is a vector such that $\|H\|$ is small enough, then
X + H will also be a point of U and so f(X + H) is defined. Note that

$$X + H = (x_1 + h_1, \ldots, x_n + h_n).$$

This is the generalization of the x + h with which we dealt previously.

When f is a function of two variables, which we write (x, y), then we
use the notation H = (h, k) so that

$$X + H = (x + h, y + k).$$

The point X + H is close to X and we are interested in the difference
f(X + H) - f(X), which is the difference of the value of the function at
X + H and the value of the function at X. If this difference approaches 0
when H approaches O, then we say that f is continuous. We say that f is
differentiable at X if the partial derivatives $D_1 f(X), \ldots, D_n f(X)$ exist,
and if there exists a function g (defined for small H) such that

$$\lim_{H \to O} g(H) = 0 \quad\text{(also written } \lim_{\|H\| \to 0} g(H) = 0\text{)}$$

and

$$f(X + H) - f(X) = D_1 f(X)h_1 + \cdots + D_n f(X)h_n + \|H\|g(H).$$

With the other notation for partial derivatives, this last relation reads:

$$f(X + H) - f(X) = \frac{\partial f}{\partial x_1}h_1 + \cdots + \frac{\partial f}{\partial x_n}h_n + \|H\|g(H).$$

We say that / is differentiable in the open set U if it is differentiable at 
every point of U, so that the above relation holds for every point X in U. 

In view of the definition of the gradient in §2, we can rewrite our funda¬ 
mental relation in the form 


(2) $f(X + H) - f(X) = (\operatorname{grad} f(X)) \cdot H + \|H\|g(H)$.


The term $\|H\|g(H)$ has an order of magnitude smaller than the previous
term involving the dot product. This is one advantage of the present 
notation. We know how to handle the formalism of dot products and are 
accustomed to it, and its geometric interpretation. This will help us later 
in interpreting the gradient geometrically. 

For the moment, we observe that the gradient is the only vector which 
will make formula (2) valid (cf. Exercise 5). 
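
Formula (2) can be observed numerically by solving it for g(H) and letting H shrink. In the sketch below we take $f(x, y) = x^2y^3$ at X = (1, 2), where grad f(X) = (16, 12) by Example 5 of §2, and let H = rE with E a fixed unit vector and r tending to 0. The direction E = (0.6, 0.8) and the values of r are arbitrary choices of ours.

```python
import math

def f(x, y):
    return x**2 * y**3

X = (1.0, 2.0)
grad = (16.0, 12.0)   # grad f(1, 2), from Example 5 of the preceding section

def g(H):
    """Error term g(H) = (f(X + H) - f(X) - grad f(X) . H) / ||H||, for H != O."""
    h, k = H
    norm_H = math.hypot(h, k)
    linear = grad[0] * h + grad[1] * k
    return (f(X[0] + h, X[1] + k) - f(*X) - linear) / norm_H

direction = (0.6, 0.8)   # a unit vector, chosen arbitrarily
errors = [abs(g((r * direction[0], r * direction[1]))) for r in (1e-1, 1e-2, 1e-3)]
print(errors)
```

The printed values decrease roughly in proportion to r, which is what one expects when g(H) approaches 0 as H approaches O.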

In two variables, the definition of differentiability reads

$$f(x + h, y + k) - f(x, y) = \frac{\partial f}{\partial x}h + \frac{\partial f}{\partial y}k + \|H\|g(H).$$

We view the term

$$\frac{\partial f}{\partial x}h + \frac{\partial f}{\partial y}k$$

as an approximation to f(X + H) - f(X), depending in a particularly
simple way on h and k.

If we use the abbreviation

$$\operatorname{grad} f = \nabla f,$$

then formula (2) can be written

$$f(X + H) - f(X) = \nabla f(X) \cdot H + \|H\|g(H).$$

As with grad f, one must read $(\nabla f)(X)$ and not the meaningless $\nabla(f(X))$
since f(X) is a number for each value of X, and thus it makes no sense to
apply $\nabla$ to a number. The symbol $\nabla$ is applied to the function f, and
$(\nabla f)(X)$ is the value of $\nabla f$ at X.







Example. Suppose that we consider values for H pointing only in the
direction of the standard unit vectors. In the case of two variables,
consider for instance H = (h, 0). Then for such H, the condition for
differentiability reads:

$$f(X + H) = f(x + h, y) = f(x, y) + \frac{\partial f}{\partial x}h + |h|g(H).$$

In higher dimensional space, let $E_i = (\ldots, 0, 1, 0, \ldots)$ be the i-th
unit vector. Let $H = hE_i$ for some number h, so that

$$H = (\ldots, 0, h, 0, \ldots).$$

Then for such H,

$$f(X + H) = f(X + hE_i) = f(X) + \frac{\partial f}{\partial x_i}h + |h|g(H).$$

Example. We can often estimate error terms with an expression
$\|H\|g(H)$ where g(H) approaches 0 as $\|H\|$ approaches 0, by using
standard properties of the absolute value, namely

$$|a + b| \le |a| + |b|.$$

For instance, let H = (h, k) where h, k are numbers. Then by definition,

$$\|H\| = \sqrt{h^2 + k^2} \quad\text{and}\quad \|H\|^2 = h^2 + k^2.$$

Observe that

$$h^2 \le h^2 + k^2 = \|H\|^2.$$

Hence

$$|h| \le \|H\|.$$

Similarly,

$$|h^2 + hk| \le |h^2| + |hk| = |h|^2 + |h|\,|k|.$$

Hence

$$|h^2 + hk| \le \|H\|^2 + \|H\|\,\|H\| \le 2\|H\|^2.$$

Example. You should read this example in connection with the last step 
of the proof of the next theorem. If you do not wish to put too much 
emphasis on theory, take the next theorem for granted and skip both this 
example and the proof. Let $g_1$, $g_2$ be functions defined for small values of
H such that

    \[ \lim_{H \to O} g_1(H) = 0 \quad \text{and} \quad \lim_{H \to O} g_2(H) = 0. \]





We want to see that the expression 

    \[ h g_1(H) + k g_2(H) \]

can be put in the form $\|H\| g(H)$ where $\lim_{H \to O} g(H) = 0$. We write

    \[ h g_1(H) + k g_2(H) = \|H\| \frac{h}{\|H\|} g_1(H) + \|H\| \frac{k}{\|H\|} g_2(H)
       = \|H\| \left[ \frac{h}{\|H\|} g_1(H) + \frac{k}{\|H\|} g_2(H) \right]. \]

Let g(H) be the expression in brackets. Each factor $h/\|H\|$ and $k/\|H\|$
has absolute value $\leq 1$. Hence each one of the terms inside the bracket
approaches 0 as H approaches O. Thus we have written

    \[ h g_1(H) + k g_2(H) = \|H\| g(H), \]

as desired.

Theorem 2. Let f be a function defined on some open set U. Assume
that its partial derivatives exist for every point in this open set, and that 
they are continuous. Then f is differentiable. 

Proof. For simplicity of notation, we shall use two variables. Thus we
deal with a function f(x, y). We let H = (h, k). Let (x, y) be a point
in U, and take H small, $H \neq (0, 0)$. We have to consider the difference
f(X + H) - f(X), which is simply

    \[ f(x + h, y + k) - f(x, y). \]

This is equal to

    \[ f(x + h, y + k) - f(x, y + k) + f(x, y + k) - f(x, y). \]

Applying the mean value theorem for functions of one variable, and
applying the definition of partial derivatives, we see that there is a number
s between x and x + h such that

    \[ f(x + h, y + k) - f(x, y + k) = D_1 f(s, y + k) h. \tag{3} \]

Similarly, there is a number t between y and y + k such that

    \[ f(x, y + k) - f(x, y) = D_2 f(x, t) k. \tag{4} \]

We shall now analyze the expressions on the right-hand side of equations
(3) and (4).

Let

    \[ g_1(H) = D_1 f(s, y + k) - D_1 f(x, y). \]

As H approaches O, (s, y + k) approaches (x, y) because s is between x
and x + h. Since $D_1 f$ is continuous, it follows that

    \[ \lim_{H \to O} g_1(H) = 0. \]

But

    \[ D_1 f(s, y + k) = D_1 f(x, y) + g_1(H). \]

Hence equation (3) can be rewritten as

    \[ f(x + h, y + k) - f(x, y + k) = D_1 f(x, y) h + h g_1(H). \tag{5} \]

By a similar argument, we can rewrite equation (4) in the form

    \[ f(x, y + k) - f(x, y) = D_2 f(x, y) k + k g_2(H) \tag{6} \]

with some function $g_2(H)$ such that

    \[ \lim_{H \to O} g_2(H) = 0. \]

If we add (5) and (6) we obtain

    \[ f(X + H) - f(X) = D_1 f(X) h + D_2 f(X) k + h g_1(H) + k g_2(H). \tag{7} \]

In view of the example given before our theorem, we see that the last
two terms on the right are of the form $\|H\| g(H)$. This proves the theorem.

Remark 1. If we dealt with n variables, then we would consider the 
expression for f(X + H) - f(X) given by

    \[ \begin{aligned}
       & f(x_1 + h_1, \ldots, x_n + h_n) - f(x_1, x_2 + h_2, \ldots, x_n + h_n) \\
       & \quad + f(x_1, x_2 + h_2, \ldots, x_n + h_n) - f(x_1, x_2, x_3 + h_3, \ldots, x_n + h_n) \\
       & \quad + \cdots \\
       & \quad + f(x_1, \ldots, x_{n-1}, x_n + h_n) - f(x_1, \ldots, x_n).
       \end{aligned} \]

We would then apply the mean value theorem at each step, take the sum,
and argue in essentially the same way as with two variables.

Remark 2. Some sort of smoothness assumption on the function 
besides the existence of the partial derivatives must be made in order to 
insure that it is differentiable at a point. For instance, consider the 
function f defined by

    \[ f(x, y) = \frac{xy}{x^2 + y^2} \quad \text{if } (x, y) \neq (0, 0), \qquad f(0, 0) = 0. \]

You should have worked out the level lines for this function, and found
that they are given by straight lines through the origin. In particular, you
see that the function is not continuous at the origin. However, its partial
derivatives exist and are easily computed by using the definitions, namely:

    \[ D_1 f(0, 0) = \lim_{h \to 0} \frac{f(0 + h, 0) - f(0, 0)}{h}
                   = \lim_{h \to 0} \frac{0 - 0}{h} = \lim_{h \to 0} 0 = 0. \]

Similarly, $D_2 f(0, 0) = 0$. Now do Exercise 8.
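
The behavior described in Remark 2 can be observed numerically. The following Python sketch, which assumes the function is xy/(x^2 + y^2) away from the origin and 0 at the origin, evaluates the difference quotients at the origin and the values of f along the line y = x.

```python
def f(x, y):
    # The function of Remark 2: xy / (x^2 + y^2) away from the origin, 0 at the origin.
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

# Difference quotients defining D_1 f(0, 0) and D_2 f(0, 0): every one of them is 0.
d1_quotients = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in (1e-1, 1e-3, 1e-6)]
d2_quotients = [(f(0.0, k) - f(0.0, 0.0)) / k for k in (1e-1, 1e-3, 1e-6)]

# Along the line y = x the function is constantly 1/2,
# so it cannot tend to f(0, 0) = 0 as (x, y) approaches the origin.
diagonal_values = [f(t, t) for t in (1e-1, 1e-3, 1e-6)]

print(d1_quotients, d2_quotients, diagonal_values)
```

Both partial derivatives at the origin are 0, yet the values on the diagonal stay at 1/2 however close one comes to the origin.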


Exercises 

1. Let f(x, y) = 2x - 3y. What are $\partial f/\partial x$ and $\partial f/\partial y$?

2. Let A = (a, b) and let f be the function on $R^2$ such that $f(X) = A \cdot X$.
Let X = (x, y). In terms of the coordinates of A, determine $\partial f/\partial x$ and
$\partial f/\partial y$.

3. Let A = (a, b, c) and let f be the function on $R^3$ such that $f(X) = A \cdot X$.
Let X = (x, y, z). In terms of the coordinates of A, determine $\partial f/\partial x$,
$\partial f/\partial y$, and $\partial f/\partial z$.

4. Generalize the above two exercises to n-space.

5. Let f be defined on an open set U. Let X be a point of U. Let A be a vector,
and let g be a function defined for small H, such that

    \[ \lim_{H \to O} g(H) = 0. \]

Assume that

    \[ f(X + H) - f(X) = A \cdot H + \|H\| g(H). \]

Prove that A = grad f(X). You may do this exercise in 2 variables first
and then in 3 variables, and let it go at that. Use coordinates, e.g. let
A = (a, b) and X = (x, y). Use special values of H.

6. Let H = (h, k). Prove:

(a) $|h^2 + 3hk| \leq 4\|H\|^2$.    (b) $|h^3 + h^2 k + k^3| \leq 3\|H\|^3$.

(c) $|3hk^2 + 2h^3| \leq 5\|H\|^3$.    (d) $|(h + k)^4| \leq 16\|H\|^4$.

(e) $|(h + k)|^3 \leq 8\|H\|^3$.


7. Let

    \[ g(h, k) = \frac{h^2 - k^2}{h^2 + k^2} \]

be defined for $(h, k) \neq (0, 0)$. Find

    \[ \lim_{h \to 0} g(h, k), \qquad \lim_{k \to 0} g(h, k), \]

    \[ \lim_{k \to 0} \left[ \lim_{h \to 0} g(h, k) \right], \qquad \lim_{h \to 0} \left[ \lim_{k \to 0} g(h, k) \right]. \]

8. Compute the partial derivatives of the function f(x, y) given at the end of the
section at any point $(x, y) \neq (0, 0)$ by the usual formulas. You see that the
partial derivatives exist everywhere, but the function is not continuous.


CHAPTER IV


The Chain Rule and the Gradient


In this chapter, we prove the chain rule for functions of several variables 
and give a number of applications. Among them will be several
interpretations for the gradient. These form one of the central points of our
theory. They show how powerful the tools we have accumulated turn 
out to be. 


§1. The chain rule

Let f be a function defined on some open set U. Let $t \mapsto X(t)$ be a
curve such that the values X(t) are contained in U. Then we can form
the composite function $f \circ X$, which is a function of t, given by

    \[ (f \circ X)(t) = f(X(t)). \]

As an example, take $f(x, y) = e^x \sin(xy)$. Let $X(t) = (t^2, t^3)$. Then

    \[ f(X(t)) = e^{t^2} \sin(t^5). \]

This is a function of t in the old sense of functions of one variable.

The chain rule tells us how to find the derivative of this function,
provided we know the gradient of f and the derivative X'. Its statement
is as follows.

Chain Rule. Let f be a function which is defined and differentiable on an
open set U. Let $X\colon I \to R^n$ be a differentiable curve (defined for some
interval of numbers t) such that the values X(t) lie in the open set U.
Then the function

    \[ t \mapsto f(X(t)) \]

is differentiable (as a function of t), and

    \[ \frac{d f(X(t))}{dt} = (\operatorname{grad} f)(X(t)) \cdot X'(t). \]

In the notation dX/dt, this also reads

    \[ \frac{d f(X(t))}{dt} = (\operatorname{grad} f)(X(t)) \cdot \frac{dX}{dt}. \]

Before proving the chain rule, we restate it in terms of components.
If $X = (x_1, \ldots, x_n)$ then

    \[ \frac{d f(X(t))}{dt} = \frac{\partial f}{\partial x_1} \frac{dx_1}{dt} + \cdots + \frac{\partial f}{\partial x_n} \frac{dx_n}{dt}. \]

If f is a function of two variables (x, y) then

    \[ \frac{d f(X(t))}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}. \]

This can be applied to the seemingly more general situation when x, y are
functions of more than one variable. Suppose for instance that

    \[ x = \varphi(t, u) \quad \text{and} \quad y = \psi(t, u) \]

are differentiable functions of two variables. Let

    \[ g(t, u) = f(\varphi(t, u), \psi(t, u)). \]

If we keep u fixed and take the partial derivative of g with respect to t,
then we can apply our chain rule, and obtain

    \[ \frac{\partial g}{\partial t} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t}. \]

The components are of course useful in computations, to determine 
partial derivatives explicitly, but they will not be used in the proof. 
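
For the example $f(x, y) = e^x \sin(xy)$ and $X(t) = (t^2, t^3)$ given at the beginning of this section, the component form of the chain rule can be compared with a direct difference quotient of the one-variable function. The Python sketch below does this; the function names, the sample value t = 0.7, and the step size are our own choices.

```python
import math

def grad_f(x, y):
    # (df/dx, df/dy) for f(x, y) = e^x sin(xy), by the product rule.
    e = math.exp(x)
    return (e * math.sin(x * y) + e * y * math.cos(x * y),
            e * x * math.cos(x * y))

def X(t):
    return (t ** 2, t ** 3)

def X_prime(t):
    return (2 * t, 3 * t ** 2)

def chain_rule_derivative(t):
    # (grad f)(X(t)) . X'(t)
    gx, gy = grad_f(*X(t))
    dx, dy = X_prime(t)
    return gx * dx + gy * dy

def direct_derivative(t, step=1e-6):
    # Central difference quotient of the one-variable function e^{t^2} sin(t^5).
    def composite(s):
        return math.exp(s ** 2) * math.sin(s ** 5)
    return (composite(t + step) - composite(t - step)) / (2 * step)

t0 = 0.7
print(chain_rule_derivative(t0), direct_derivative(t0))
```

The two printed numbers agree up to the accuracy of the difference quotient.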

Proof of the chain rule. By definition, we must investigate the quotient

    \[ \frac{f(X(t + h)) - f(X(t))}{h}. \]

Let

    \[ K = K(t, h) = X(t + h) - X(t). \]

Then our quotient can be rewritten in the form

    \[ \frac{f(X(t) + K) - f(X(t))}{h}. \]

Using the definition of differentiability for f we have

    \[ f(X + K) - f(X) = (\operatorname{grad} f)(X) \cdot K + \|K\| g(K) \]

and

    \[ \lim_{\|K\| \to 0} g(K) = 0. \]

Replacing K by what it stands for, namely X(t + h) - X(t), and dividing
by h, we obtain:

    \[ \frac{f(X(t + h)) - f(X(t))}{h} = (\operatorname{grad} f)(X(t)) \cdot \frac{X(t + h) - X(t)}{h} \pm \left\| \frac{X(t + h) - X(t)}{h} \right\| g(K). \]

(The sign is that of h, because $\|K\|/h = \pm \|K/h\|$.)
As h approaches 0, the first term of the sum approaches what we want,
namely

    \[ (\operatorname{grad} f)(X(t)) \cdot X'(t). \]

The second term approaches

    \[ \pm \|X'(t)\| \lim_{h \to 0} g(K), \]

and when h approaches 0, so does K = X(t + h) - X(t). Hence the
second term of the sum approaches 0. This proves our chain rule.


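Example 1. Let $f(x, y) = x^2 + 2xy$. Let $x = r \cos\theta$ and $y = r \sin\theta$.
Let $g(r, \theta) = f(r \cos\theta, r \sin\theta)$ be the composite function. Find $\partial g/\partial\theta$.
We have

    \[ \frac{\partial x}{\partial\theta} = -r \sin\theta \quad \text{and} \quad \frac{\partial y}{\partial\theta} = r \cos\theta. \]

Hence

    \[ \frac{\partial g}{\partial\theta} = (2x + 2y)(-r \sin\theta) + 2x (r \cos\theta). \]

If you want the answer completely in terms of $r, \theta$, you can substitute
$r \cos\theta$ and $r \sin\theta$ for x and y respectively in this expression.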
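
The result of Example 1 can be checked against a difference quotient in the angle. The Python sketch below assumes $f(x, y) = x^2 + 2xy$ as in the example; the function names, the sample point, and the step size are illustrative choices of ours.

```python
import math

def g(r, theta):
    # g(r, theta) = f(r cos(theta), r sin(theta)) with f(x, y) = x^2 + 2xy.
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x ** 2 + 2 * x * y

def dg_dtheta_formula(r, theta):
    # The chain rule expression: (2x + 2y)(-r sin(theta)) + 2x (r cos(theta)).
    x, y = r * math.cos(theta), r * math.sin(theta)
    return (2 * x + 2 * y) * (-r * math.sin(theta)) + 2 * x * (r * math.cos(theta))

def dg_dtheta_numeric(r, theta, step=1e-6):
    # Central difference quotient in theta, with r kept fixed.
    return (g(r, theta + step) - g(r, theta - step)) / (2 * step)

print(dg_dtheta_formula(2.0, 0.3), dg_dtheta_numeric(2.0, 0.3))
```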


Example 2. Let $w = f(x, y, z) = e^{xy} \cos z$ and let

    \[ x = tu, \quad y = \sin(tu), \quad z = u^2. \]

Then

    \[ \begin{aligned}
    \frac{\partial w}{\partial u} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial u} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial u} \\
    &= y e^{xy} (\cos z) t + x e^{xy} (\cos z)(\cos tu) t - e^{xy} (\sin z) 2u \\
    &= \sin(tu) e^{tu \sin(tu)} (\cos u^2) t + tu\, e^{tu \sin(tu)} (\cos u^2)(\cos tu) t - e^{tu \sin(tu)} (\sin u^2) 2u.
    \end{aligned} \]

In this last expression, we have substituted the values for x, y, z in terms
of t and u, thus giving the partial derivative completely in terms of these
variables.

Example 3. Sometimes the letters x and y are occupied to denote
variables which are not the first and second variables of the function f. In
this case, other letters must be used if we wish to replace $D_1 f$ and $D_2 f$ by
partial derivatives with respect to these variables. For example, let

    \[ u = f(x^2 - y, xy). \]

To find $\partial u/\partial x$, we let

    \[ s = x^2 - y \quad \text{and} \quad t = xy. \]

Then

    \[ \frac{\partial u}{\partial x} = \frac{\partial f}{\partial s} \frac{\partial s}{\partial x} + \frac{\partial f}{\partial t} \frac{\partial t}{\partial x}
       = \frac{\partial f}{\partial s} 2x + \frac{\partial f}{\partial t} y = D_1 f(s, t) 2x + D_2 f(s, t) y. \]

The function u depends on x, y and we may write u = g(x, y). Then

    \[ \frac{\partial g}{\partial x} = \frac{\partial f}{\partial s} 2x + \frac{\partial f}{\partial t} y. \tag{1} \]

Similarly,

    \[ \frac{\partial g}{\partial y} = \frac{\partial f}{\partial s} \frac{\partial s}{\partial y} + \frac{\partial f}{\partial t} \frac{\partial t}{\partial y} = -\frac{\partial f}{\partial s} + \frac{\partial f}{\partial t} x. \tag{2} \]

We may then solve the linear equations (1) and (2), and we find for
instance

    \[ \frac{\partial f}{\partial t} = \frac{1}{y + 2x^2} \left[ \frac{\partial g}{\partial x} + 2x \frac{\partial g}{\partial y} \right]. \]

The advantage of the $D_1 f$, $D_2 f$ notation is that it does not depend on a
choice of letters, and makes it clear that we take the partial derivatives of
f with respect to the first and second variables. On the other hand, it is
slightly more clumsy to write $D_1 f(s, t)$ rather than $\partial f/\partial s$. Thus the second
notation, when used with an appropriate choice of variables, is shorter and
a little more mechanical. We emphasize, however, that it can only be used
when the letters denoting the variables have been fixed properly.

Example 4. Let f be a function on $R^3$. Let us interpret f as giving the
temperature, so that at any point X in $R^3$, the value of the function f(X)
is the temperature at X. Suppose that a bug moves in space along a
differentiable curve, which we may denote in parametric form by

    \[ t \mapsto B(t). \]

Thus B(t) = (x(t), y(t), z(t)) is the position of the bug at time t. Let us
assume that the bug starts from a point where he feels that the temperature
is comfortable, and therefore that the temperature is constant along the
path on which he moves. In other words, f is constant along the curve
B(t). This means that for all values of t, we have

    \[ f(B(t)) = c, \]

where c is constant. Differentiating with respect to t, and using the chain
rule, we find that

    \[ \operatorname{grad} f(B(t)) \cdot B'(t) = 0. \]

This means that the gradient of f is perpendicular to the velocity vector at
every point of the curve.

[Figure 1: grad f(B(t)) drawn perpendicular to the curve B(t).]

Example 5. Let $f(x, y, z) = g(x^2 - 3zy + xz)$, where g is a differentiable
function of one variable. Then the chain rule becomes much simpler,
and we find

    \[ \frac{\partial f}{\partial x} = g'(x^2 - 3zy + xz)(2x + z). \]

We denote the derivative of g by g' as usual. We do not write it as dg/dx,
because x is a letter which is already occupied for other purposes. We
could let

    \[ u = x^2 - 3zy + xz, \]

in which case it would be all right to write

    \[ \frac{\partial f}{\partial x} = \frac{dg}{du} \frac{\partial u}{\partial x}, \]

and we would get the same answer as above. In general, if h(x, y, z) is a
function of x, y, z, and g is a function of one variable, then we may form
the composite function

    \[ f(x, y, z) = g(h(x, y, z)). \]

We then have

    \[ \frac{\partial f}{\partial x} = g'(h(x, y, z)) \frac{\partial h}{\partial x}. \]

Written in terms of the first partial, we have the longer (but more accurate)
expression

    \[ D_1 f(x, y, z) = g'(h(x, y, z)) D_1 h(x, y, z). \]

Through practice, you will recognize which notation to use most efficiently,
depending on the cases to be considered.

Example 6. Let $g(t, x, y) = f(t^2 x, ty)$. Then

    \[ \frac{\partial g}{\partial t} = D_1 f(t^2 x, ty) 2tx + D_2 f(t^2 x, ty) y. \]

Here again, since the letter x is occupied, we cannot write $\partial f/\partial x$ for $D_1 f$.


Exercises 


(All functions are assumed to be differentiable as needed.) 

1. If x = u(r, s, t) and y = v(r, s, t) and z = f(x, y), write out the formula for

    \[ \frac{\partial z}{\partial r} \quad \text{and} \quad \frac{\partial z}{\partial t}. \]

2. Find the partial derivatives with respect to x, y, s, and t for the following
functions.

(a) $f(x, y, z) = x^3 + 3xyz - y^2 z$, $x = 2t + s$, $y = -t - s$, $z = t^2 + s^2$

(b) $f(x, y) = (x + y)/(1 - xy)$, $x = \sin 2t$, $y = \cos(3t - s)$

3. Let $f(x, y, z) = (x^2 + y^2 + z^2)^{1/2}$. Find $\partial f/\partial x$ and $\partial f/\partial y$.

4. Let $r = (x_1^2 + \cdots + x_n^2)^{1/2}$. What is $\partial r/\partial x_i$?

5. If u = f(x - y, y - x), show that

    \[ \frac{\partial u}{\partial x} + \frac{\partial u}{\partial y} = 0. \]

6. If $u = x^3 f(y/x, z/x)$, show that

    \[ x \frac{\partial u}{\partial x} + y \frac{\partial u}{\partial y} + z \frac{\partial u}{\partial z} = 3u. \]


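7. (a) Let $x = r \cos\theta$ and $y = r \sin\theta$. Let z = f(x, y). Show that

    \[ \frac{\partial z}{\partial r} = \frac{\partial f}{\partial x} \cos\theta + \frac{\partial f}{\partial y} \sin\theta, \qquad
       \frac{1}{r} \frac{\partial z}{\partial\theta} = -\frac{\partial f}{\partial x} \sin\theta + \frac{\partial f}{\partial y} \cos\theta. \]

(b) If we let $z = g(r, \theta) = f(r \cos\theta, r \sin\theta)$, show that

    \[ \left( \frac{\partial g}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial g}{\partial\theta} \right)^2 = \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2. \]

8. (a) Let g be a function of r, let $r = \|X\|$, and X = (x, y, z). Let f(X) = g(r).
Show that

    \[ \left( \frac{dg}{dr} \right)^2 = \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 + \left( \frac{\partial f}{\partial z} \right)^2. \]

(b) Let g(x, y) = f(x + y, x - y), where f is a differentiable function of
two variables, say f = f(u, v). Show that

    \[ \frac{\partial g}{\partial x} \frac{\partial g}{\partial y} = \left( \frac{\partial f}{\partial u} \right)^2 - \left( \frac{\partial f}{\partial v} \right)^2. \]

(c) Let g(x, y) = f(2x + 7y), where f is a differentiable function of one
variable. Show that

    \[ 2 \frac{\partial g}{\partial y} = 7 \frac{\partial g}{\partial x}. \]

9. Let g be a function of r, and $r = \|X\|$. Let f(X) = g(r). Find grad f(X)
for the following functions.

(a) $g(r) = 1/r$    (b) $g(r) = r^2$

(c) $g(r) = 1/r^3$    (d) $g(r) = e^{-r^2}$

(e) $g(r) = \log \dfrac{1}{r}$    (f) $g(r) = 4/r^m$

10. Let $x = u \cos\theta - v \sin\theta$, and $y = u \sin\theta + v \cos\theta$, with $\theta$ equal to a
constant. Let f(x, y) = g(u, v). Show that

    \[ \left( \frac{\partial g}{\partial u} \right)^2 + \left( \frac{\partial g}{\partial v} \right)^2 = \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2. \]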

The next five exercises concern certain parametrizations, and some of the results
from them will be used in Exercises 16 and 17.

11. Let A, B be two unit vectors such that A • B = 0. Let 

F(t) = (cos t)A + (sin t)B. 

Show that F(t) lies on the sphere of radius 1 centered at the origin, for each
value of t.

12. Let P, Q be two points on the sphere of radius 1, centered at the origin.
Let L(t) = P + t(Q - P), with $0 \leq t \leq 1$. If there exists a value of t in
[0, 1] such that L(t) = O, show that t = 1/2 and that P = -Q.

13. Let P, Q be two points on the sphere of radius 1. Assume that $P \neq -Q$.
Show that there exists a differentiable curve joining P and Q on the sphere
of radius 1, centered at the origin. [Hint: Divide L(t) in Exercise 12 by its
length.]

14. If P, Q are two unit vectors such that P = -Q, show that there exists a
differentiable curve joining P and Q on the sphere of radius 1, centered at
the origin. You may assume that there exists a unit vector A which is
perpendicular to P. Then use Exercise 11.


15. Parametrize the ellipse

    \[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \]

by a differentiable curve.

16. Let f be a differentiable function (in two variables) such that grad f(X) = cX
for some constant c and all X in 2-space. Show that f is constant on any
circle of radius a > 0, centered at the origin. [Hint: Put x = a cos t and
y = a sin t and find df/dt.]

17. (a) Generalize the preceding exercise to the case of n variables. You may
assume that any two points on the sphere of radius a centered at the
origin are connected by a differentiable curve.

(b) Let f be a differentiable function in n variables, and assume that there
exists a function g such that grad f(X) = g(X)X. Show that f is constant
on the sphere of radius a > 0 centered at the origin. (In other words,
in Exercise 16, the hypothesis about the constant c can be weakened
to an arbitrary function.)

18. Let $r = \|X\|$. Let g be a differentiable function of one variable whose
derivative is never equal to 0. Let f(X) = g(r). Show that grad f(X) is
parallel to X for $X \neq O$.

19. Let f be a differentiable function of two variables and assume that there is
an integer $m \geq 1$ such that

    \[ f(tx, ty) = t^m f(x, y) \]

for all numbers t and all x, y. Prove Euler’s relation

    \[ x \frac{\partial f}{\partial x} + y \frac{\partial f}{\partial y} = m f(x, y). \]

20. Generalize Exercise 19 to n variables, namely let f be a differentiable function
of n variables and assume that there exists an integer $m \geq 1$ such that
$f(tX) = t^m f(X)$ for all numbers t and all points X in $R^n$. Show that

    \[ x_1 \frac{\partial f}{\partial x_1} + \cdots + x_n \frac{\partial f}{\partial x_n} = m f(X), \]

which can also be written $X \cdot \operatorname{grad} f(X) = m f(X)$. How does this exercise
apply to Exercise 6?

21. Let f be a differentiable function defined on all of $R^n$. Assume that
f(tP) = t f(P) for all numbers t and all points P in $R^n$. Show that for all
P we have

    \[ f(P) = \operatorname{grad} f(O) \cdot P. \]


§2. Tangent plane

Let f be a differentiable function and c a number. The set of points X
such that f(X) = c and $\operatorname{grad} f(X) \neq O$ is called a surface.

Let X(t) be a differentiable curve. We shall say that the curve lies on
the surface if, for all t, we have

    \[ f(X(t)) = c. \]

This simply means that all the points of the curve satisfy the equation
of the surface. If we differentiate this relation, we get from the chain rule:

    \[ \operatorname{grad} f(X(t)) \cdot X'(t) = 0. \]

Let P be a point of the surface, and let X(t) be a curve on the surface
passing through P. This means that there is a number $t_0$ such that
$X(t_0) = P$. For this value $t_0$, we obtain

    \[ \operatorname{grad} f(P) \cdot X'(t_0) = 0. \]

Thus the gradient of f at P is perpendicular to the tangent vector of the
curve at P. [We assume that $X'(t_0) \neq O$.] This is true for any differentiable
curve passing through P. It is therefore very reasonable to define the plane
(or hyperplane) tangent to the surface at P to be the plane passing through
P and perpendicular to the vector grad f(P). (We know from Chapter XVII
how to find such planes.) This definition applies only when $\operatorname{grad} f(P) \neq O$.
If grad f(P) = O, then we do not define the notion of tangent plane.

The fact that grad f(P) is perpendicular to every curve passing through
P on the surface also gives us an interpretation of the gradient as being
perpendicular to the surface

    \[ f(X) = c, \]

which is one of the level surfaces for the function f (Fig. 2).


[Figure 2: grad f(P) drawn perpendicular to the level surface f(X) = c at P.]

Example 1. Find the tangent plane to the surface

    \[ x^2 + y^2 + z^2 = 3 \]

at the point (1, 1, 1).

Let $f(X) = x^2 + y^2 + z^2$. Then at the point P = (1, 1, 1),

    \[ \operatorname{grad} f(P) = (2, 2, 2). \]

The equation of a plane passing through P and perpendicular to a vector
N is

    \[ X \cdot N = P \cdot N. \]

In the present case, this yields

    \[ 2x + 2y + 2z = 2 + 2 + 2 = 6. \]

Observe that our arguments also give us a means of finding a vector 
perpendicular to a curve in 2-space at a given point, simply by applying 
the preceding discussion to the plane instead of 3-space. 

Example 2. Find the tangent line to the curve

    \[ x^2 y + y^3 = 10 \]

at the point (1, 2), and find a vector perpendicular to the curve at that
point.

Let $f(x, y) = x^2 y + y^3$. The gradient at the given point P is easily
computed, and we find

    \[ \operatorname{grad} f(P) = (4, 13). \]

This is a vector N perpendicular to the curve at the given point. The tangent
line is also given by $X \cdot N = P \cdot N$, and thus is

    \[ 4x + 13y = 4 + 26 = 30. \]
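
The recipe used in Examples 1 and 2 (take N = grad f(P), then write $X \cdot N = P \cdot N$) is mechanical enough to sketch in a few lines of Python. The helper names below are hypothetical, and the gradients are entered by hand from the formulas of the two examples.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tangent_equation(normal, point):
    # The tangent (hyper)plane at P with normal N = grad f(P) is X . N = P . N.
    # Returned as (N, P . N): the coefficients and the right-hand side.
    return normal, dot(point, normal)

# Example 1: f = x^2 + y^2 + z^2, grad f = (2x, 2y, 2z), P = (1, 1, 1).
P1 = (1, 1, 1)
plane = tangent_equation(tuple(2 * c for c in P1), P1)

# Example 2: f = x^2 y + y^3, grad f = (2xy, x^2 + 3y^2), P = (1, 2).
x, y = 1, 2
line = tangent_equation((2 * x * y, x ** 2 + 3 * y ** 2), (x, y))

print(plane)  # coefficients (2, 2, 2) and right-hand side 6
print(line)   # coefficients (4, 13) and right-hand side 30
```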

Example 3. A surface may also be given in the form z = g(x, y) where
g is some function of two variables. In this case, the tangent plane is
determined by viewing the surface as expressed by the equation

    \[ g(x, y) - z = 0. \]

For instance, suppose the surface is given by $z = x^2 + y^2$. We wish to
determine the tangent plane at (1, 2, 5). Let $f(x, y, z) = x^2 + y^2 - z$.
Then grad f(x, y, z) = (2x, 2y, -1) and

    \[ \operatorname{grad} f(1, 2, 5) = (2, 4, -1). \]

The equation of the tangent plane at P = (1, 2, 5) perpendicular to
N = (2, 4, -1) is

    \[ 2x + 4y - z = P \cdot N = 5. \]

This is the desired equation.


Exercises

1. Find the equation of the tangent plane and normal line to each of the
following surfaces at the specific point.

(a) $x^2 + y^2 + z^2 = 49$ at (6, 2, 3)

(b) $xy + yz + zx - 1 = 0$ at (1, 1, 0)

(c) $x^2 + xy^2 + y^3 + z + 1 = 0$ at (2, -3, 4)

(d) $2y - z^3 - 3xz = 0$ at (1, 7, 2)

(e) $x^2 y^2 + xz - 2y^3 = 10$ at (2, 1, 4)

(f) $\sin xy + \sin yz + \sin xz = 1$ at $(1, \pi/2, 0)$

2. Let $f(x, y, z) = z - e^x \sin y$, and $P = (\log 3, 3\pi/2, -3)$. Find:

(a) grad f(P),

(b) the normal line at P to the level surface for f which passes through P,

(c) the tangent plane to this surface at P.

3. Find the parametric equation of the tangent line to the curve of intersection
of the following surfaces at the indicated point.

(a) $x^2 + y^2 + z^2 = 49$ and $x^2 + y^2 = 13$ at (3, 2, -6)

(b) $xy + z = 0$ and $x^2 + y^2 + z^2 = 9$ at (2, 1, -2)

(c) $x^2 - y^2 - z^2 = 1$ and $x^2 - y^2 + z^2 = 9$ at (3, 2, 2)

[Note. The tangent line above may be defined to be the line of intersection
of the tangent planes at the given point.]

4. Let f(X) = 0 be a differentiable surface. Let Q be a point which does not
lie on the surface. Given a differentiable curve X(t) on the surface, defined
on an open interval, give the formula for the distance between Q and a
point X(t). Assume that this distance reaches a minimum for $t = t_0$. Let
$P = X(t_0)$. Show that the line joining Q to P is perpendicular to the curve
at P.

5. Find the equation of the tangent plane to the surface z = f(x, y) at the
given point P when f is the following function:

(a) $f(x, y) = x^2 + y^2$, P = (3, 4, 25)

(b) $f(x, y) = x/(x^2 + y^2)^{1/2}$, $P = (3, -4, \tfrac{3}{5})$

(c) $f(x, y) = \sin(xy)$ at $P = (1, \pi, 0)$

6. Find the equation of the tangent plane to the surface $x = e^{2y - z}$ at (1, 1, 2).


§3. Directional derivative 

Let f be defined on an open set and assume that f is differentiable. Let
P be a point of the open set, and let A be a unit vector (i.e. $\|A\| = 1$).
Then P + tA is the parametric equation of a straight line in the direction
of A and passing through P. We observe that

    \[ \frac{d(P + tA)}{dt} = A. \]

For instance, if n = 2 and P = (p, q), A = (a, b), then

    \[ P + tA = (p + ta, q + tb), \]

or in terms of coordinates,

    \[ x = p + ta, \quad y = q + tb. \]

Hence

    \[ \frac{dx}{dt} = a \quad \text{and} \quad \frac{dy}{dt} = b \]

so that

    \[ \frac{d(P + tA)}{dt} = (a, b) = A. \]

The same argument works in higher dimensions.

Hence by the chain rule, if we take the derivative of the function
$t \mapsto f(P + tA)$, which is defined for small values of t, we obtain

    \[ \frac{d f(P + tA)}{dt} = \operatorname{grad} f(P + tA) \cdot A. \]

When t is equal to 0, this derivative is equal to

    \[ \operatorname{grad} f(P) \cdot A. \]

For obvious geometrical reasons, we call it the directional derivative of
f in the direction of A. We interpret it as the rate of change of f along
the straight line in the direction of A, at the point P. Thus if we agree
on the notation $D_A f(P)$ for the directional derivative of f at P in the
direction of the unit vector A, then we have

    \[ D_A f(P) = \left. \frac{d f(P + tA)}{dt} \right|_{t=0} = \operatorname{grad} f(P) \cdot A. \]


In using this formula, the reader should remember that A is taken to be 
a unit vector. When a direction is given in terms of a vector whose length 
is not 1, then one must first divide this vector by its length before applying 
the formula. 


Example. Let $f(x, y) = x^2 + y^3$ and let B = (1, 2). Find the directional
derivative of f in the direction of B, at the point (-1, 3).

We note that B is not a unit vector. Its length is $\sqrt{5}$. Let

    \[ A = \frac{1}{\sqrt{5}} (1, 2). \]

Then A is a unit vector having the same direction as B. Let P = (-1, 3).
Then grad f(P) = (-2, 27). Hence by our formula, the directional
derivative is equal to:

    \[ \operatorname{grad} f(P) \cdot A = \frac{1}{\sqrt{5}} (-2 + 54) = \frac{52}{\sqrt{5}}. \]
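
The computation in this example, including the step of dividing the direction by its length, can be written as a short Python sketch. The function names are ours, and the gradient of $f(x, y) = x^2 + y^3$ is entered by hand.

```python
import math

def grad_f(x, y):
    # f(x, y) = x^2 + y^3, so grad f = (2x, 3y^2).
    return (2 * x, 3 * y ** 2)

def directional_derivative(point, direction):
    # Normalize the direction first, since the formula requires a unit vector.
    length = math.hypot(*direction)
    unit = tuple(c / length for c in direction)
    return sum(g * a for g, a in zip(grad_f(*point), unit))

value = directional_derivative((-1, 3), (1, 2))
print(value, 52 / math.sqrt(5))
```

Passing B = (1, 2) directly, rather than the unit vector A, is safe here only because the function normalizes its second argument.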


Consider again a differentiable function f on an open set U.

Let P be a point of U. Let us assume that $\operatorname{grad} f(P) \neq O$, and let A
be a unit vector. We know that

    \[ \operatorname{grad} f(P) \cdot A = \|\operatorname{grad} f(P)\| \, \|A\| \cos\theta, \]

where $\theta$ is the angle between grad f(P) and A. Since $\|A\| = 1$, we see
that the directional derivative is equal to $\|\operatorname{grad} f(P)\| \cos\theta$. The value of
$\cos\theta$ varies between -1 and +1 when we select all possible unit vectors A.

The maximal value of $\cos\theta$ is obtained when we select A such that
$\theta = 0$, i.e. when we select A to have the same direction as grad f(P).
In that case, the directional derivative is equal to the length of the gradient.

Thus we have obtained another interpretation for the gradient:

Its direction is that of maximal increase of the function, and its length
is the rate of increase of the function in that direction.

The directional derivative in the direction of A is at a minimum when
$\cos\theta = -1$. This is the case when we select A to have opposite direction
to grad f(P). That direction is therefore the direction of maximal decrease
of the function.

For example, f might represent a temperature distribution in space. At
any point P, a particle which feels cold and wants to become warmer
fastest should move in the direction of grad f(P). Another particle which
is warm and wants to cool down fastest should move in the direction of
-grad f(P).
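
This interpretation can be tested by brute force: sample many unit vectors A and see which one gives the largest and the smallest directional derivative. The Python sketch below uses the gradient (-2, 27) from the preceding example; the grid of 3600 angles is an arbitrary choice of ours.

```python
import math

grad = (-2.0, 27.0)  # grad f(P) from the preceding example
length = math.hypot(*grad)

def directional(theta):
    # Directional derivative along the unit vector A = (cos(theta), sin(theta)).
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Sample unit vectors at 3600 equally spaced angles.
step = 2 * math.pi / 3600
angles = [step * i for i in range(3600)]
best = max(angles, key=directional)
worst = min(angles, key=directional)

# Angle of the gradient itself, reduced to the interval [0, 2 pi).
gradient_angle = math.atan2(grad[1], grad[0]) % (2 * math.pi)

print(best, gradient_angle, directional(best), length)
```

Up to the resolution of the grid, the best angle is that of the gradient, the worst is the opposite one, and the extreme values are plus and minus the length of the gradient.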


Exercises 

1. Let $f(x, y, z) = z - e^x \sin y$, and $P = (\log 3, 3\pi/2, -3)$. Find:

(a) the directional derivative of f at P in the direction of (1, 2, 2),

(b) the maximum and minimum values for the directional derivatives of
f at P.

2. Find the directional derivatives of the following functions at the specified
points in the specified directions.

(a) $\log (x^2 + y^2)^{1/2}$ at (1, 1), direction (2, 1)

(b) $xy + yz + zx$ at (-1, 1, 7), direction (3, 4, -12)

(c) $4x^2 + 9y^2$ at (2, 1) in the direction of maximum directional derivative

3. A temperature distribution in space is given by the function

    \[ f(x, y) = 10 + 6 \cos x \cos y + 3 \cos 2x + 4 \cos 3y. \]

At the point $(\pi/3, \pi/3)$, find the direction of greatest increase of temperature,
and the direction of greatest decrease of temperature.

4. In what direction are the following functions of X increasing most rapidly
at the given point?

(a) $x/\|X\|^{3/2}$ at (1, -1, 2)  (X = (x, y, z))

(b) $\|X\|^5$ at (1, 2, -1, 1)  (X = (x, y, z, w))

5. Find the tangent plane to the surface $x^2 + y^2 - z^2 = 18$ at the point
(3, 5, -4).

6. Let $f(x, y, z) = (x + y)^2 + (y + z)^2 + (z + x)^2$. What is the direction
of greatest increase of the function at the point (2, -1, 2)? What is the
directional derivative of f in this direction at that point?

7. Let $f(x, y) = x^2 + xy + y^2$. What is the direction in which f is increasing
most rapidly at the point (-1, 1)? Find the directional derivative of f in
this direction.


§4. Conservation law 

As a final application of the chain rule, we derive the conservation law 
of physics. 

Let U be an open set. By a vector field on U we mean an association 
which to every point of U associates a vector of the same dimension. 

If f is a differentiable function on U, then we observe that grad f is a
vector field, which associates the vector grad f(P) to the point P of U.

A vector field in physics is often interpreted as a field of forces.

If F is a vector field on U, and X a point of U, then we denote by F(X)
the vector associated to X by F and call it the value of F at X, as usual.

If F is a vector field, and if there exists a differentiable function f
such that F = grad f, then the vector field is called conservative. Since
-grad f = grad (-f), it does not matter whether we use f or -f in the
definition of conservative.

Let us assume that F is a conservative field on U, and let $\Phi$ be a
differentiable function such that for all points X in U we have

    \[ F(X) = -\operatorname{grad} \Phi(X). \]

In physics, one interprets $\Phi$ as a potential function. Suppose that a
particle of mass m moves along a differentiable curve X(t) in U, and let us
assume that this particle obeys Newton’s law:

    \[ F(X) = mX'', \quad \text{that is} \quad F(X(t)) = mX''(t) \]

for all t where X(t) is defined. Then according to our hypotheses,

    \[ mX'' + \operatorname{grad} \Phi(X) = O. \]

Take the dot product of both sides with X'. We obtain

    \[ mX' \cdot X'' + \operatorname{grad} \Phi(X) \cdot X' = 0. \]

But the derivative (with respect to t) of $(X')^2 = X' \cdot X'$ is $2X' \cdot X''$. The derivative
with respect to t of $\Phi(X(t))$ is equal to

    \[ \operatorname{grad} \Phi(X) \cdot X' \]

by the chain rule. Hence the expression on the left of our last equation
is the derivative of the function

    \[ \tfrac{1}{2} m (X')^2 + \Phi(X), \]

and that derivative is 0. Hence this function is equal to a constant. This
is what one means by the conservation law.

The function $\tfrac{1}{2} m (X')^2$ is called the kinetic energy, and the conservation
law states that the sum of the kinetic and potential energies is constant.
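
To see the conservation law at work, one can take a specific potential and an explicit solution of Newton's law, and evaluate the total energy at several times. The Python sketch below assumes the sample potential $\Phi(X) = \tfrac{1}{2} \|X\|^2$ with mass m = 1, for which X(t) = (cos t, 2 sin t) satisfies mX'' = -grad Φ(X) = -X. These choices are ours and serve only as an illustration.

```python
import math

m = 1.0  # mass (illustrative value)

def potential(X):
    # Phi(X) = |X|^2 / 2, so that F = -grad Phi = -X (a sample conservative field).
    return 0.5 * (X[0] ** 2 + X[1] ** 2)

def position(t):
    # X(t) = (cos t, 2 sin t) satisfies Newton's law m X'' = -X for m = 1.
    return (math.cos(t), 2 * math.sin(t))

def velocity(t):
    # X'(t), obtained by differentiating each coordinate of X(t).
    return (-math.sin(t), 2 * math.cos(t))

def total_energy(t):
    v = velocity(t)
    kinetic = 0.5 * m * (v[0] ** 2 + v[1] ** 2)
    return kinetic + potential(position(t))

energies = [total_energy(0.1 * i) for i in range(100)]
print(min(energies), max(energies))
```

Kinetic and potential energy each vary along this path, but their sum stays at 5/2 for every t.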

It is not true that all vector fields are conservative. We shall discuss 
the problem of determining which ones are conservative in the next 
chapter. 

The fields of classical physics are for the most part conservative. For 
instance, consider a force which is inversely proportional to the square 
of the distance from the point to the origin, and in the direction of the 
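position vector. Then there is a constant C such that for $X \neq O$ we have

    \[ F(X) = C \frac{1}{\|X\|^2} \frac{X}{\|X\|}, \]

because $X/\|X\|$ is a unit vector in the direction of X. Thus

    \[ F(X) = C \frac{X}{r^3}, \]

where $r = \|X\|$. A potential function for F (in the sense that its gradient
is F) is given by

    \[ -\frac{C}{r}. \]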

This is immediately verified by taking the partial derivatives of this 
function. 
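
The verification by partial derivatives can also be carried out numerically. The Python sketch below compares a difference-quotient gradient of -C/r with the inverse-square field $C X / r^3$ at one sample point; the constant C = 3, the point, and the helper `numeric_grad` are assumptions made for the illustration.

```python
import math

C = 3.0  # illustrative constant

def phi(X):
    # Candidate potential function -C / r, with r = ||X||.
    return -C / math.sqrt(sum(c * c for c in X))

def F(X):
    # Inverse-square field in the direction of the position vector: C X / r^3.
    r = math.sqrt(sum(c * c for c in X))
    return tuple(C * c / r ** 3 for c in X)

def numeric_grad(func, X, step=1e-6):
    # Central difference approximation to each partial derivative.
    grad = []
    for i in range(len(X)):
        up = list(X)
        down = list(X)
        up[i] += step
        down[i] -= step
        grad.append((func(up) - func(down)) / (2 * step))
    return tuple(grad)

X0 = (1.0, -2.0, 0.5)
print(numeric_grad(phi, X0), F(X0))
```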


Exercises 

1. Find a potential function for a force field which is inversely proportional 
to the distance from the point to the origin, and is in the direction of the 
position vector. 

2. Same question, replacing “distance” with “cube of the distance”.

3. Let k be an integer $\geq 1$. Find a potential function for the vector field F
given by

    \[ F(X) = \frac{1}{r^k} X, \quad \text{where } r = \|X\|. \]

[Hint: Cf. Exercise 9(f) of §1.]




CHAPTER V 


Potential Functions and Curve Integrals 


We are going to deal systematically with the possibility of finding a 
potential function for a vector field. The discussion of the existence of 
such a function will be limited to the case of two variables. Actually, 
there is no essential difficulty in extending the results to arbitrary n-space, 
but we leave this to the reader. (Cf. the answer section.) 

The problem is one of integration, and the line integrals are a natural 
continuation of the integrals at the end of §1 (taken on vertical and 
horizontal lines). 

§1. Potential functions

Let F be a vector field on an open set U. If $\varphi$ is a differentiable function
on U such that $F = \operatorname{grad} \varphi$, then we say that $\varphi$ is a potential function for F.
(Or, in hip terminology, a pot function.)

One can raise two questions about potential functions. Are they unique, 
and do they exist? 

We consider the first question, and we shall be able to give a satisfactory 
answer to it. The problem is analogous to determining an integral for a
function of one variable, up to a constant, and we shall formulate and 
prove the analogous statement in the present situation. 

We recall that even in the case of functions of one variable, it is not
true that whenever two functions f, g are such that

    \[ \frac{df}{dx} = \frac{dg}{dx}, \]

then f and g differ by a constant, unless we assume that f, g are defined
on some interval. As we emphasized in the First Course, we could for
instance take

    \[ f(x) = \begin{cases} \dfrac{1}{x} + 5 & \text{if } x < 0, \\[4pt] \dfrac{1}{x} - \pi & \text{if } x > 0, \end{cases} \qquad g(x) = \frac{1}{x} \quad \text{if } x \neq 0. \]

Then f, g have the same derivative, but there is no constant C such that
for all $x \neq 0$ we have f(x) = g(x) + C.

In the case of functions of several variables, we shall have to make a 
similar restriction on the domain of definition of the functions. 

Let U be an open set and let P, Q be two points of U. We shall say
that P, Q can be joined by a differentiable curve if there exists a
differentiable curve X(t) (with t ranging over some interval of numbers)
which is contained in U, and two values of t, say t_1 and t_2 in that
interval, such that

    X(t_1) = P \quad \text{and} \quad X(t_2) = Q.

For example, if U is the entire plane, then any two points can be joined
by a straight line. In fact, if P, Q are two points, then we take

    X(t) = P + t(Q - P).

When t = 0, then X(0) = P. When t = 1, then X(1) = Q.

It is not always the case that two points of an open set can be joined 
by a straight line. We have drawn a picture of two points P, Q in an 
open set U which cannot be so joined. 


Figure 1 

We are now in position to state the theorem we had in mind. 

Theorem 1. Let U be an open set, and assume that any two points in U 
can be joined by a differentiable curve. Let f, g be two differentiable 
functions on U. If grad f (X) = grad g (X) for every point X of U, then 
there exists a constant C such that 

    f(X) = g(X) + C

for all points X of U. 

Proof. We note that grad (f - g) = grad f - grad g = O, and we
must prove that f - g is constant. Letting φ = f - g, we see that it
suffices to prove: If grad φ(X) = O for every point X of U, then φ is
constant.

Let P be a fixed point of U and let Q be any other point. Let X(t) be
a differentiable curve joining P to Q, which is contained in U, and defined
over an interval. The derivative of the function φ(X(t)) is, by the chain
rule,

    \operatorname{grad} \varphi(X(t)) \cdot X'(t).

But X(t) is a point of U for all values of t in the interval. Hence by our
assumption, the derivative of φ(X(t)) is 0 for all t in the interval. Hence
there is a constant C such that

    \varphi(X(t)) = C

for all t in the interval. In other words, the function φ is constant on the
curve. Hence φ(P) = φ(Q).

This result is true for any point Q of U. Hence φ is constant on U, as
was to be shown.

Our theorem proves the uniqueness of potential functions (within the 
restrictions placed by our extra hypothesis on the open set U). 

We still have the problem of determining when a vector field F admits 
a potential function. 

We first make some remarks in the case of functions of two variables. 
Let F be a vector field (in 2-space), so that we can write 

F(x, y) = (f(x,y),g(x,y)) 

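with functions f and g, defined over a suitable open set. We want to know
when there exists a function φ(x, y) such that

    \frac{\partial \varphi}{\partial x} = f \quad \text{and} \quad \frac{\partial \varphi}{\partial y} = g.

Such a function would be a potential function for F, by definition. (We
assume throughout that all hypotheses of differentiability are satisfied as
needed.)

Suppose that such a function φ exists. Then

    \frac{\partial f}{\partial y} = \frac{\partial}{\partial y}\left(\frac{\partial \varphi}{\partial x}\right) \quad \text{and} \quad \frac{\partial g}{\partial x} = \frac{\partial}{\partial x}\left(\frac{\partial \varphi}{\partial y}\right).

We shall show in the next chapter that under suitable hypotheses, the
two partial derivatives on the right are equal. This means that if there
exists a potential function for F, then

    \frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}.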

This gives us a simple test in practice to tell whether a potential function
may exist.

Theorem 2. Let f, g be differentiable functions having continuous partial
derivatives on an open set U in 2-space. If

    \frac{\partial f}{\partial y} \neq \frac{\partial g}{\partial x},

then the vector field given by F(x, y) = (f(x, y), g(x, y)) does not have a
potential function.

Example. Consider the vector field given by 

    F(x, y) = (x^2 y, \sin xy).

Then we let f(x, y) = x^2 y and g(x, y) = sin xy. We have:

    \frac{\partial f}{\partial y} = x^2 \quad \text{and} \quad \frac{\partial g}{\partial x} = y \cos xy.

Since ∂f/∂y ≠ ∂g/∂x, it follows that the vector field does not have a
potential function.
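The test of Theorem 2 can also be tried numerically. The following is a minimal sketch, not part of the original text: the helper name `partial`, the sample point (1, 2), and the step size are illustrative assumptions. It estimates the two partial derivatives of the example by central differences and shows that they disagree at that point.

```python
import math

def partial(func, x, y, var, h=1e-5):
    """Central-difference estimate of a first partial derivative at (x, y)."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

# Components of F(x, y) = (x^2 y, sin xy) from the example.
def f(x, y):
    return x * x * y

def g(x, y):
    return math.sin(x * y)

x0, y0 = 1.0, 2.0                  # sample point (an arbitrary choice)
df_dy = partial(f, x0, y0, "y")    # exact value is x0^2
dg_dx = partial(g, x0, y0, "x")    # exact value is y0 * cos(x0 * y0)
print(df_dy, dg_dx)
```

A single point where the two values differ is enough to rule out a potential function. Agreement at sample points, on the other hand, proves nothing by itself.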

We shall prove in §3 that the converse of Theorem 2 is true in some very 
important cases. Before stating and proving the pertinent theorem, we 
first discuss an auxiliary situation. 


Exercises 


Determine which of the following vector fields have potential functions.
The vector fields are described by the functions (f(x, y), g(x, y)).

1. (1/x, xe^{xy})

2. (sin(xy), cos(xy))

3. (e^{xy}, e^{x+y})

4. (3 x y ,x y)

5. (5 x*y, x cos(xy))

6. ( — ~ ~ ’ 3 xy
   \y x 2 +■ y 2


§2. Differentiating under the integral 

Let f be a continuous function on a rectangle a ≤ x ≤ b and c ≤ y ≤ d.
We can then form a function of y by taking

    \psi(y) = \int_a^b f(x, y) \, dx.

Example 1. We can determine explicitly the function ψ if we let
f(x, y) = sin(xy), namely:

    \psi(y) = \int_0^\pi \sin(xy) \, dx = -\frac{\cos(xy)}{y} \Big|_0^\pi = -\frac{\cos(\pi y) - 1}{y}.

We are interested in finding the derivative of ψ. The next theorem
allows us to do this in certain cases, by differentiating with respect to
y under the integral sign.

Theorem 3. Assume that f is continuous on the preceding rectangle, and
that D_2 f exists and is continuous. Let

    \psi(y) = \int_a^b f(x, y) \, dx.

Then ψ is differentiable, and

    \frac{d\psi}{dy} = D\psi(y) = \int_a^b D_2 f(x, y) \, dx = \int_a^b \frac{\partial f(x, y)}{\partial y} \, dx.

Proof. By definition, we have to investigate the Newton quotient for ψ.
We have

    \frac{\psi(y + h) - \psi(y)}{h} = \int_a^b \left[ \frac{f(x, y + h) - f(x, y)}{h} \right] dx.

We then have to find

    \lim_{h \to 0} \int_a^b \frac{f(x, y + h) - f(x, y)}{h} \, dx.

If we knew that we can put the limit sign inside the integral, we would
then conclude that the preceding limit is equal to

    \int_a^b \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h} \, dx = \int_a^b D_2 f(x, y) \, dx,

thus proving our theorem. We shall not give the argument which justifies
moving the limit sign inside the integral, because it depends on (ε, δ)
considerations which are mostly omitted in this book.

Example 2. Letting f(x, y) = sin(xy) as before, we find that

    D_2 f(x, y) = x \cos(xy).

If we let

    \psi(y) = \int_0^\pi f(x, y) \, dx,

then

    D\psi(y) = \int_0^\pi D_2 f(x, y) \, dx = \int_0^\pi x \cos(xy) \, dx.

By evaluating this last integral, or by differentiating the expression found
for ψ at the beginning of the section, the reader will find the same value,
namely

    D\psi(y) = \frac{\pi \sin(\pi y)}{y} + \frac{\cos(\pi y) - 1}{y^2}.
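A numerical check of Example 2 may make the statement of Theorem 3 more concrete. The sketch below is not from the text: the Simpson helper, the sample value y = 0.7, and the closed form (obtained here by differentiating ψ(y) = (1 - cos πy)/y) are our own assumptions. It compares the Newton quotient of ψ with the integral of D_2 f.

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def psi(y):
    # psi(y) = integral of sin(xy) for x from 0 to pi
    return simpson(lambda x: math.sin(x * y), 0.0, math.pi)

def dpsi_under_integral(y):
    # integral of D_2 f(x, y) = x cos(xy) for x from 0 to pi
    return simpson(lambda x: x * math.cos(x * y), 0.0, math.pi)

def dpsi_closed_form(y):
    # derivative of (1 - cos(pi y)) / y, computed by hand
    return math.pi * math.sin(math.pi * y) / y + (math.cos(math.pi * y) - 1) / y**2

y0, h = 0.7, 1e-5   # sample point and step, chosen arbitrarily
newton_quotient = (psi(y0 + h) - psi(y0 - h)) / (2 * h)
print(newton_quotient, dpsi_under_integral(y0), dpsi_closed_form(y0))
```

The three printed numbers should agree to several decimal places, which is what the theorem predicts for this f.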
We can apply the previous theorem using any x as upper limit of the
integration. Thus we may let

    \psi(x, y) = \int_a^x f(t, y) \, dt,

in which case the theorem reads

    D_2 \psi(x, y) = \int_a^x D_2 f(t, y) \, dt.

We use t as a variable of integration to distinguish it from the x which is
now used as an end point of the interval [a, x] instead of [a, b].

The preceding way of determining the derivative of ψ with respect to y
is called differentiating under the integral sign. Note that it is completely
different from the differentiation in the fundamental theorem of calculus.
In this case, we have an integral

    g(x) = \int_a^x f(t) \, dt,

and

    \frac{dg}{dx} = Dg(x) = f(x).

Thus when f is a function of two variables, and ψ is defined as above, the
fundamental theorem of calculus states that

    \frac{\partial \psi}{\partial x} = D_1 \psi(x, y) = f(x, y).

For example, if we let

    \psi(x, y) = \int_0^x \sin(ty) \, dt,

then

    D_1 \psi(x, y) = \sin(xy),

but by Theorem 3,

    D_2 \psi(x, y) = \int_0^x \cos(ty) \, t \, dt.
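To see the two kinds of differentiation side by side, here is a small numerical sketch for ψ(x, y) = ∫_0^x sin(ty) dt. The helper `simpson`, the sample point, and the step size are assumptions made for illustration only. The derivative in x is compared with sin(xy), and the derivative in y with the integral of t cos(ty).

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def psi(x, y):
    # psi(x, y) = integral of sin(t*y) for t from 0 to x
    return simpson(lambda t: math.sin(t * y), 0.0, x)

x0, y0, h = 1.2, 0.8, 1e-5   # arbitrary sample point and step

# D_1 psi: vary the upper limit x (fundamental theorem of calculus).
d1_numeric = (psi(x0 + h, y0) - psi(x0 - h, y0)) / (2 * h)
d1_expected = math.sin(x0 * y0)

# D_2 psi: vary the parameter y (differentiation under the integral sign).
d2_numeric = (psi(x0, y0 + h) - psi(x0, y0 - h)) / (2 * h)
d2_expected = simpson(lambda t: t * math.cos(t * y0), 0.0, x0)

print(d1_numeric, d1_expected)
print(d2_numeric, d2_expected)
```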


Exercises 

In each of the following cases, find D_1 ψ(x, y) and D_2 ψ(x, y), by
evaluating the integrals.


1. \f/(x, y) = J e tv dt 


3. M*, y) = J (y + ty dt 


2. i(x 
4 . \f/(x 


■»-L 


cos (ty) dt 


e y+t dt 
5. ψ(x, y) = 
7. ψ(x, y) = 



e v '‘ dt 

log(oO 

t 


dt 


6. Hx, ;y) = 


8. i Kx,y) = 


t 2 y dt 



sin(3o9 dt 


§3. Local existence of potential functions 

We shall state a theorem which will give us conditions under which the 
converse of Theorem 2 is true. 

Theorem 4. Let f, g be differentiable functions on an open set of the
plane. If this open set is the entire plane, or if it is an open disc, or the
inside of a rectangle, if the partial derivatives of f, g exist and are
continuous, and if

    \frac{\partial f}{\partial y} = \frac{\partial g}{\partial x},

then the vector field F(x, y) = (f(x, y), g(x, y)) has a potential function.

We shall indicate how a proof of Theorem 4 might go for a rectangle 
after we have discussed some examples. 

Example 1. Determine whether the vector field F given by 

    F(x, y) = (e^{xy}, e^{x+y})

has a potential function.

Here, f(x, y) = e^{xy} and g(x, y) = e^{x+y}. We have:

    \frac{\partial f}{\partial y} = x e^{xy} \quad \text{and} \quad \frac{\partial g}{\partial x} = e^{x+y}.

Since these are not equal, we know that there cannot be a potential 
function. 

If the partial derivatives ∂f/∂y and ∂g/∂x turn out to be equal, then
one can try to find a potential function by integrating with respect to
one of the variables. Thus we try to find

    \int f(x, y) \, dx,

keeping y constant, and taking the ordinary integral of functions of one
variable. If we can find such an integral, it will be a function ψ(x, y),
whose partial with respect to x will be equal to f(x, y) (by definition).
Adding a function of y, we can then adjust it so that its partial with
respect to y is equal to g(x, y).

Example 2. Let F(x, y) = (2xy, x^2 + 3y^2). Determine whether this
vector field has a potential function, and if it does, find it.

Applying the test which we mentioned above, we find that a potential
function may exist. To find it, we consider first the integral

    \int 2xy \, dx,

viewing y as constant. We obtain x^2 y for the indefinite integral. We
must now find a function u(y) such that

    \frac{\partial}{\partial y}\left(x^2 y + u(y)\right) = x^2 + 3y^2.

This means that we must find a function u(y) such that

    x^2 + \frac{du}{dy} = x^2 + 3y^2,

or in other words,

    \frac{du}{dy} = 3y^2.

This is a simple integration problem in one variable, and we find u(y) = y^3.
Thus finally, if we let

    \varphi(x, y) = x^2 y + y^3,

we see that φ is a potential function for F.

Proof of Theorem 4. We let the rectangle be defined by

    a ≤ x ≤ b and c ≤ y ≤ d.

We let

    \varphi(x, y) = \int_a^x f(t, y) \, dt + \int_c^y g(a, u) \, du.

Then the second integral on the right does not depend on x, and by the
fundamental theorem of calculus,

    D_1 \varphi(x, y) = f(x, y)

as wanted. On the other hand, using Theorem 3, and differentiating with
respect to y, we get:

    D_2 \varphi(x, y) = \int_a^x D_2 f(t, y) \, dt + g(a, y)
                      = \int_a^x D_1 g(t, y) \, dt + g(a, y)
                      = g(t, y) \Big|_{t=a}^{t=x} + g(a, y)
                      = g(x, y) - g(a, y) + g(a, y)
                      = g(x, y),

thus yielding the desired expression for the second partial of φ. This
proves Theorem 4.
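The formula for the potential function used in the proof can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the field (2xy, x^2 + 3y^2) of Example 2, picks the corner (a, c) = (0, 0), and approximates the two integrals by Simpson's rule (the helper `simpson` is our own).

```python
def simpson(func, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# Components of the field F = (f, g) from Example 2.
def f(x, y):
    return 2 * x * y

def g(x, y):
    return x * x + 3 * y * y

A, C = 0.0, 0.0   # corner (a, c) of the rectangle; an arbitrary choice

def phi(x, y):
    # phi(x, y) = int_a^x f(t, y) dt + int_c^y g(a, u) du
    first = simpson(lambda t: f(t, y), A, x)
    second = simpson(lambda u: g(A, u), C, y)
    return first + second

print(phi(1.5, 2.0), 1.5**2 * 2.0 + 2.0**3)
```

With this choice of corner the result agrees with x^2 y + y^3. A different corner would change φ only by an additive constant.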
Note that the proof is entirely similar to that of the example preceding
it. The first integral with respect to x solves the requirements of the first
partial of φ, and we correct it by an integral involving only y in order to
adjust the answer to give the desired partial with respect to y.

Exercises 

Determine which of the following vector fields admit potential functions. 

1. (e^x, sin xy)          2. (2x^2 y, y^3)

3. (2xy, y^2)             4. (y^2 x^2, x + y^4)

Find potential functions for the following vector fields.

5. (a) F(X) = (1/r) X     (b) F(X) = (1/r^2) X

   (c) F(X) = r^n X (if n is an integer ≠ -2). In this exercise,
   r = ||X||, and X ≠ O.

6. (4xy, 2x^2)            7. (xy cos xy + sin xy, x^2 cos xy)

8. (3x^2 y^2, 2x^3 y)     9. (2x, 4y^3)

10. (a) (ye^{xy}, xe^{xy})   (b) (y cos xy, x cos xy)

11. Let r = ||X||. Let g be a differentiable function of one variable. Show
that the vector field defined by

    F(X) = \frac{g'(r)}{r} X

in the domain X ≠ O always admits a potential function. What is this
potential function?

12. Generalize Theorem 4 to functions of three variables. Assume that we have
a vector field F = (f_1, f_2, f_3) defined on a 3-dimensional rectangle

    [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3],

satisfying the conditions

    \frac{\partial f_i}{\partial x_j} = \frac{\partial f_j}{\partial x_i} \quad (\text{or } D_j f_i = D_i f_j)

for all indices i, j. Prove that this vector field has a potential function.
Extend this to more than 3 variables.


13. Find a potential function f for the following vector fields F given as F(x, y, z).

(a) (2x, 3y, 4z)                        (b) (y + z, x + z, x + y)

(c) (e^{y+2z}, xe^{y+2z}, 2xe^{y+2z})   (d) (y sin z, x sin z, xy cos z)

(e) (yz, xz + z^3, xy + 3yz^2)          (f) (e^{yz}, xze^{yz}, xye^{yz})

(g) (z^2, 2y, 2xz)                      (h) (yz cos xy, xz cos xy, sin xy)
§4. Curve integrals 

Let U be an open set (of n-space), and let F be a vector field on U. 
We can represent F by components: 

    F(X) = (f_1(X), \ldots, f_n(X)),

each f_i being a function. When n = 2,

    F(x, y) = (f(x, y), g(x, y)).

If each function f_1(X), ..., f_n(X) is continuous, then we shall say that
F is a continuous vector field. If each function f_1(X), ..., f_n(X) is
differentiable, then we shall say that F is a differentiable vector field.

We shall also deal with curves. Rather than use the letter X to denote
a curve, we shall use another letter, for instance C, to avoid certain
confusions which might arise in the present context. Furthermore, it is now
convenient to assume that our curve C is defined on a closed interval
I = [a, b], with a < b. For each number t in I, the value C(t) is a point
in n-space. We shall say that the curve C lies in U if C(t) is a point of U
for all t in I. We say that C is continuously differentiable if its derivative
C'(t) = dC/dt exists and is continuous. We abbreviate the expression
“continuously differentiable” by saying that the curve is a C^1-curve, or
of class C^1.

Let F be a continuous vector field on U, and let C be a continuously
differentiable curve in U. The dot product

    F(C(t)) \cdot \frac{dC}{dt}

is a function of t, and it can be shown easily that this function is
continuous (by ε and δ techniques which we always omit).

Example 1. Let F(x, y) = (e^{xy}, y^2), and C(t) = (t, sin t). Then

    C'(t) = (1, \cos t)

and

    F(C(t)) = (e^{t \sin t}, \sin^2 t).

Hence

    F(C(t)) \cdot C'(t) = e^{t \sin t} + (\cos t)(\sin^2 t).

Suppose that C is defined on the interval [a, b]. We define the integral
of F along C to be

    \int_C F = \int_a^b F(C(t)) \cdot \frac{dC}{dt} \, dt.
This integral is a direct generalization of the familiar notion of the integral
of functions of one variable. If we are given a function f(u), and u is a
function of t, then

    \int_{u(a)}^{u(b)} f(u) \, du = \int_a^b f(u(t)) \frac{du}{dt} \, dt.

(This is the formula describing the substitution method for evaluating
integrals.)

In n-space, C(a) and C(b) are points, and our curve passes through
these two points. Thus the integral we have written down can be
interpreted as an integral of the vector field, along the curve, between the
two points. It will also be convenient to write the integral in the form

    \int_{P,C}^{Q} F = \int_P^Q F(C) \cdot dC

to denote the integral along the curve C, from P to Q.

Example 2. Let F(x, y) = (x^2 y, y^3). Find the integral of F along the
straight line from the origin to the point (1, 1).

We can parametrize the line in the form

    C(t) = (t, t).

Thus

    F(C(t)) = (t^3, t^3).

Furthermore,

    \frac{dC}{dt} = (1, 1).

Hence

    F(C(t)) \cdot \frac{dC}{dt} = 2t^3.

The integral we must find is therefore equal to:

    \int_0^1 2t^3 \, dt = \frac{2t^4}{4} \Big|_0^1 = \frac{1}{2}.
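The definition of the integral of F along C translates directly into a numerical recipe. The following is a minimal sketch, assuming Simpson's rule as the one-variable integrator; the function name `curve_integral` and its argument layout are hypothetical choices made for this illustration. It is applied to the field and the line of Example 2.

```python
def curve_integral(F, C, dC, a, b, n=1000):
    """Approximate the integral of F(C(t)) . C'(t) over [a, b] by Simpson's rule.

    F  maps the coordinates of a point to a tuple of components,
    C  maps t to a point, dC maps t to the derivative C'(t).
    """
    def integrand(t):
        return sum(Fi * ci for Fi, ci in zip(F(*C(t)), dC(t)))

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

# Example 2: F(x, y) = (x^2 y, y^3) along C(t) = (t, t), 0 <= t <= 1.
F = lambda x, y: (x * x * y, y ** 3)
C = lambda t: (t, t)
dC = lambda t: (1.0, 1.0)

value = curve_integral(F, C, dC, 0.0, 1.0)
print(value)
```

The derivative C'(t) is passed in explicitly rather than estimated, which keeps the sketch close to the formula in the definition.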


Remark 1. Our integral of a vector field along a curve is defined for
parametrized curves. In practice, a curve is sometimes given in a
non-parametrized way. For instance, we may want to integrate over the curve
defined by y = x^2. Then we select some parametrization which is usually
the most natural, in this case

    x = t, \quad y = t^2.

In general, if a curve is defined by a function y = g(x), we select the
parametrization

    x = t, \quad y = g(t).

For a circle of radius R centered at the origin, we select the
parametrization

    x = R \cos t, \quad y = R \sin t, \quad 0 \le t \le 2\pi,

whenever we wish to integrate counterclockwise.

For a straight line segment between two points P and Q, we take the
parametrization C given by

    C(t) = P + t(Q - P), \quad 0 \le t \le 1.

The context should always make it clear which parametrization is
intended.

Remark 2. We may be given a finite number of C^1-curves forming a
path as indicated in the following figure:



Thus formally, we define a (piecewise C^1) path C to be a finite sequence
{C_1, ..., C_m}, where each C_i is a C^1-curve, defined on an interval
[a_i, b_i], such that the end point of C_i is the beginning point of C_{i+1}.
Thus if P_i = C_i(a_i) and Q_i = C_i(b_i), then

    Q_i = P_{i+1}.

We define the integral of F along such a path C to be the sum

    \int_C F = \int_{C_1} F + \int_{C_2} F + \cdots + \int_{C_m} F.

We say that the path C is a closed path if the end point of C_m is the
beginning point of C_1.

In the following picture, we have drawn a closed path such that the
beginning point of C_1, namely P_1, is the end point of the path C_4, which
joins P_4 to P_1.
Example 3. Let F(x, y) = (x^2, xy) and let the path consist of the
segment of the parabola y = x^2 between (0, 0) and (1, 1), and the line
segment from (1, 1) to (0, 0). (Cf. Fig. 4.)



Then we let C_1(t) = (t, t^2) and C_2(t) = (1 - t, 1 - t). We let
C = (C_1, C_2). To find the integral of F along C we find the integral
along C_1 and C_2, and add these integrals. We get:

    \int_{C_1} F = \int_0^1 F(C_1(t)) \cdot (1, 2t) \, dt = \int_0^1 (t^2 + 2t^4) \, dt = \frac{1}{3} + \frac{2}{5}.

    \int_{C_2} F = \int_0^1 F(C_2(t)) \cdot (-1, -1) \, dt = \int_0^1 -2(1 - 2t + t^2) \, dt = -\frac{2}{3}.

Hence

    \int_C F = -\frac{1}{3} + \frac{2}{5} = \frac{1}{15}.
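The sum over the pieces of a path can be checked numerically as well. This sketch reuses a hypothetical helper `curve_integral` (Simpson's rule, our own choice) and represents the path of Example 3 as a list of (curve, derivative) pairs, each defined on [0, 1]; that representation is an assumption of the sketch, not notation from the text.

```python
def curve_integral(F, C, dC, a, b, n=1000):
    """Approximate the integral of F(C(t)) . C'(t) over [a, b] by Simpson's rule."""
    def integrand(t):
        return sum(Fi * ci for Fi, ci in zip(F(*C(t)), dC(t)))

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

def F(x, y):
    return (x * x, x * y)

# The closed path C = (C_1, C_2): parabola out, straight segment back.
path = [
    (lambda t: (t, t * t), lambda t: (1.0, 2.0 * t)),
    (lambda t: (1.0 - t, 1.0 - t), lambda t: (-1.0, -1.0)),
]

pieces = [curve_integral(F, C, dC, 0.0, 1.0) for C, dC in path]
total = sum(pieces)
print(pieces, total)
```

The two pieces come out close to 11/15 and -2/3. Their sum is not 0 although the path is closed, which is consistent with the fact that this F fails the test ∂f/∂y = ∂g/∂x.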

When the vector field F admits a potential function φ, then the integral
of F along a curve has a simple expression in terms of φ.

Theorem 5. Let F be a continuous vector field on the open set U and
assume that F = grad φ for some differentiable function φ on U. Let
C be a C^1-curve in U, joining the points P and Q. Then

    \int_{P,C}^{Q} F = \varphi(Q) - \varphi(P).

In particular, the integral of F is independent of the curve C joining
P and Q.

Proof. Let C be defined on the interval [a, b], so that C(a) = P and
C(b) = Q. By definition, we have

    \int_{P,C}^{Q} F = \int_a^b F(C(t)) \cdot C'(t) \, dt = \int_a^b \operatorname{grad} \varphi(C(t)) \cdot C'(t) \, dt.

But the expression inside the integral is nothing but the derivative with
respect to t of the function g given by g(t) = φ(C(t)), because of the chain
rule. Thus our integral is equal to

    \int_a^b g'(t) \, dt = g(b) - g(a) = \varphi(C(b)) - \varphi(C(a)).

This proves our theorem.


This theorem is easily extended to paths. We leave this to the reader. 

We observe that in physics, one may interpret a vector field F as 
describing a force. Then the integral of this vector field along a path C 
describes the work done by the force along this path. In particular, when 
the vector field is conservative, as in Theorem 5, the work is expressed 
in terms of the potential function for F, and the end points of the path. 


Example 4. Let F(X) = kX/r^3, where r = ||X||, and k is a constant.
This is the vector field inversely proportional to the square of the distance
from the origin, used so often in physics. Then F has a potential function,
namely the function φ such that φ(X) = -k/r. Thus the integral of F
from P = (1, 1, 1) to Q = (1, 2, -1) is simply equal to

    \varphi(Q) - \varphi(P) = -k\left(\frac{1}{\sqrt{6}} - \frac{1}{\sqrt{3}}\right).

On the other hand, if P_1, Q_1 are two points at the same distance from the
origin (i.e. lying on the same circle, centered at the origin), then the
integral of F from P_1 to Q_1 along any curve is equal to 0.
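As a sanity check of Theorem 5 on this example, one can integrate F numerically along one particular curve and compare with the difference of the potential at the end points. The sketch below assumes k = 1, uses the straight segment from P to Q (which avoids the origin), and relies on a hypothetical Simpson-based helper `curve_integral`.

```python
import math

def curve_integral(F, C, dC, a, b, n=1000):
    """Approximate the integral of F(C(t)) . C'(t) over [a, b] by Simpson's rule."""
    def integrand(t):
        return sum(Fi * ci for Fi, ci in zip(F(*C(t)), dC(t)))

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

K = 1.0   # assumed value of the constant k

def F(x, y, z):
    r = math.sqrt(x * x + y * y + z * z)
    return (K * x / r**3, K * y / r**3, K * z / r**3)

def phi(x, y, z):
    return -K / math.sqrt(x * x + y * y + z * z)

P = (1.0, 1.0, 1.0)
Q = (1.0, 2.0, -1.0)
C = lambda t: tuple(p + t * (q - p) for p, q in zip(P, Q))   # segment from P to Q
dC = lambda t: tuple(q - p for p, q in zip(P, Q))

work = curve_integral(F, C, dC, 0.0, 1.0)
expected = phi(*Q) - phi(*P)
print(work, expected)
```

Any other curve from P to Q that stays away from the origin should give the same value, up to the error of the numerical integration.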


Example 5. Let C be a closed curve, whose end point is equal to the
beginning point P. In Theorem 5, when the vector field F admits a
potential function φ, it follows that the integral of F over the closed curve
is then equal to 0, because it is equal to

    \varphi(P) - \varphi(P) = 0.

This allows us to give an example for a situation when a vector field
F = (f, g) satisfies the condition

    \frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}

but F does not have a potential function. Let

    F(x, y) = \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)

be the vector field of Exercise 11. A simple computation, left as an
exercise, shows that it satisfies the above condition. Compute the integral
of F over the closed circle of radius 1, centered at the origin. You will
find a value ≠ 0. This does not contradict Theorem 4, because the vector
field is defined on the open set obtained from the plane by deleting the
origin, i.e. the vector field is not defined at (0, 0). The open set has a
“hole” in it (a pinhole, in fact).
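The computation suggested in Example 5 can be previewed numerically. This is only a sketch under our own assumptions: the unit circle is parametrized counterclockwise by (cos t, sin t) on [0, 2π], and the helper `curve_integral` (Simpson's rule) is hypothetical.

```python
import math

def curve_integral(F, C, dC, a, b, n=1000):
    """Approximate the integral of F(C(t)) . C'(t) over [a, b] by Simpson's rule."""
    def integrand(t):
        return sum(Fi * ci for Fi, ci in zip(F(*C(t)), dC(t)))

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

def F(x, y):
    d = x * x + y * y
    return (-y / d, x / d)

C = lambda t: (math.cos(t), math.sin(t))      # unit circle, counterclockwise
dC = lambda t: (-math.sin(t), math.cos(t))

value = curve_integral(F, C, dC, 0.0, 2.0 * math.pi)
print(value, 2.0 * math.pi)
```

On the circle the integrand reduces to sin^2 t + cos^2 t = 1, so the printed value is 2π up to rounding. It is not 0, hence F cannot have a potential function on the punctured plane.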


Exercises 

Compute the curve integrals of the vector field over the indicated curves. 

1. F(x, y) = (x^2 - 2xy, y^2 - 2xy) along the parabola y = x^2 from (-2, 4)
to (1, 1).

2. (x, y, xz - y) over the line segment from (0, 0, 0) to (1, 2, 4).

3. Let r = (x^2 + y^2)^{1/2}. Let F(X) = r^{-1} X. Find the integral of F over
the circle of radius 2, taken in counterclockwise direction.

4. Let C be a circle of radius 20 with center at the origin. Let F be a vector
field such that F(X) has the same direction as X. What is the integral of
F around C?

5. What is the work done by the force F(x, y) = (x^2 - y^2, 2xy) moving a
particle of mass m along the square bounded by the coordinate axes and
the lines x = 3, y = 3 in counterclockwise direction?

6. Let F(x, y) = (cxy, x^6 y^2), where c is a positive constant. Let a, b be
numbers > 0. Find a value of a in terms of c such that the line integral of F
along the curve y = ax^b from (0, 0) to the line x = 1 is independent of b.

Find the values of the indicated integrals of vector fields along the given 
curves in Exercises 7 through 13. 

7. (y^2, -x) along the parabola x = y^2/4 from (0, 0) to (1, 2).

8. (x^2 - y^2, x) along the arc in the first quadrant of the circle
x^2 + y^2 = 4 from (0, 2) to (2, 0).

9. (x^2 y^2, xy^2) along the closed path formed by parts of the line x = 1 and
the parabola y^2 = x, counterclockwise.

10. (x^2 - y^2, x) counterclockwise around the circle x^2 + y^2 = 4.

11. (a) The vector field

    \left( \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right)

counterclockwise along the circle x^2 + y^2 = 2 from (1, 1) to (-√2, 0).

(b) The same vector field counterclockwise around the whole circle.

(c) Around the circle x^2 + y^2 = 1.

(d) Around the circle x^2 + y^2 = r^2.

(e) Verify that for this vector field, we have ∂f/∂y = ∂g/∂x. For a
continuation of this train of thought, see Green’s theorem.


12. The same vector field along the line x + y = 1 from (0,1) to (1, 0). 


13. (2xy, — 3xy) clockwise around the square bounded by the lines x = 3, 
x = 5, y = 1, y = 3. 


14. Let C = (C_1, ..., C_m) be a piecewise C^1-path in an open set U. Let F
be a continuous vector field on U, admitting a differentiable potential
function φ. Let P be the beginning point of the path and Q its end point.
Show that

    \int_C F = \varphi(Q) - \varphi(P).

[Hint: Apply Theorem 5 to the beginning point P_i and end point Q_i for
each curve C_i.]


15. Find the integral of the vector field F(x, y, z) = (2x, 3y, 4z) along the 
straight line C(t) = (t, t, t ) between the points (0, 0, 0) and (1,1,1). 

16. Find the integral of the vector field F(x, y, z) = (y + z, x + z, x + y)
along the straight line C(t) = (t, t, t) between (0, 0, 0) and (1, 1, 1).

17. Find the integral of the vector field given in Exercises 15 and 16 between
the given points along the curve C(t) = (t, t^2, t^4). Compare your answers
with those previously found.

18. Let F(x, y, z) = ( y , x, 0). Find the integral of F along the straight line 
from (1, 1,1) to (3, 3, 3). 


19. Let P, Q be points of 3-space. Show that the integral of the vector field
given by F(x, y, z) = (z^2, 2y, 2xz)


from P to Q is independent of the curve selected between P and Q. 

20. Let F(x, y) = (x/r^3, y/r^3) where r = (x^2 + y^2)^{1/2}. Find the integral of
F along the curve C(t) = (e^t cos t, e^t sin t) from the point (1, 0) to the
point (e^{2π}, 0).

21. Let F(x, y, z) = (z^3 y, z^3 x, 3z^2 xy). Show that the integral of F between
two points is independent of the curve between the points.

22. Let F(x, y) = (x^2 y, xy^2).

(a) Does this vector field admit a potential function?

(b) Compute the integral of this vector field from O to the point P indicated
on the figure, along the line segment from (0, 0) to (1/√2, 1/√2).

(c) Compute the integral of this vector field from O to P along the path
which consists of the segment from (0, 0) to (1, 0), and the arc of circle
from (1, 0) to P.

§5. Dependence of the integral on the path

By a path from now on, we mean a piecewise C^1-path, and all vector fields
are assumed continuous.

Given two points P, Q in some open set U, and a vector field F on U,
it may be that the integral of F along two paths from P to Q depends on
the path. The main theorem of this section gives three equivalent
conditions that this integral should be independent of the path. Before
discussing this theorem, we describe what we mean by integrating along a
curve in opposite direction.

Let C: [a, b] → R^n be a curve. We define the opposite curve C^- (or the
negative curve) by letting

    C^-(t) = C(a + b - t).

Thus when t = b we find that C^-(b) = C(a), and when t = a we find
that C^-(a) = C(b). As t increases from a to b, we see that a + b - t
decreases from b to a and thus we visualize C^- as going from C(b) to
C(a) in reverse direction from C (Fig. 5).



Figure 5


Lemma. Let F be a vector field on the open set U, and let C be a curve
in U, of class C^1, defined on the interval [a, b]. Then

    \int_{C^-} F = -\int_C F.

Proof. This is a simple application of the change of variables formula.
Let u = a + b - t. Then du/dt = -1. By definition and the chain rule,
we get:

    \int_{C^-} F = \int_a^b F(C^-(t)) \cdot \frac{dC^-}{dt} \, dt
                 = \int_a^b F(C(a + b - t)) \cdot C'(a + b - t)(-1) \, dt.

We now change variables, with du = -dt. When t = a then u = b, and
when t = b then u = a. Thus our integral is equal to

    \int_b^a F(C(u)) \cdot C'(u) \, du = -\int_a^b F(C(u)) \cdot C'(u) \, du,

thereby proving the lemma.

The lemma expresses the expected result, that if we integrate the vector 
field along the opposite direction, then the value of the integral is the 
negative of the value obtained by integrating F along the curve itself. 
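The lemma is easy to test numerically on a concrete curve. In the sketch below, the field F(x, y) = (x^2, xy) and the parabola C(t) = (t, t^2) on [0, 1] are chosen only for illustration, and `curve_integral` is a hypothetical Simpson-based helper. The opposite curve is built exactly as in the definition, with derivative -C'(a + b - t).

```python
def curve_integral(F, C, dC, a, b, n=1000):
    """Approximate the integral of F(C(t)) . C'(t) over [a, b] by Simpson's rule."""
    def integrand(t):
        return sum(Fi * ci for Fi, ci in zip(F(*C(t)), dC(t)))

    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

def F(x, y):
    return (x * x, x * y)

a, b = 0.0, 1.0
C = lambda t: (t, t * t)
dC = lambda t: (1.0, 2.0 * t)

# Opposite curve: same points, traversed from C(b) back to C(a).
C_opp = lambda t: C(a + b - t)
dC_opp = lambda t: tuple(-v for v in dC(a + b - t))   # chain rule

forward = curve_integral(F, C, dC, a, b)
backward = curve_integral(F, C_opp, dC_opp, a, b)
print(forward, backward)
```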

Theorem 6. Let U be an open set in R^n and let F be a vector field on U.
Assume that any two points of U can be connected by a path in U. Then
the following conditions are equivalent:

(i) The vector field F has a potential function.

(ii) The integral of F along any closed path in U is equal to 0.

(iii) If P, Q are two points in U then the integral of F from P to Q is
independent of the path.

Proof. Assume condition (i), and let φ be a potential function for F.
Let C = (C_1, ..., C_m) be a path in U where each C_i is a C^1-curve. Let
P_i be the beginning point of C_i and let Q_i be its end point, so that
Q_i = P_{i+1}. By Theorem 5, we find that

    \int_C F = \varphi(Q_m) - \varphi(P_m) + \varphi(Q_{m-1}) - \varphi(P_{m-1}) + \cdots + \varphi(Q_1) - \varphi(P_1).

All intermediate terms cancel, leaving the first and the last terms, and our
integral is equal to

    \varphi(Q_m) - \varphi(P_1).

If the path is a closed path, then Q_m = P_1 and thus the integral is equal
to 0. If P_1 = P and Q_m = Q, then we see that the value of the integral
is independent of the path; it depends only on P, Q and the potential
function, namely φ(Q) - φ(P). Thus both conditions (ii) and (iii) follow
from (i).

Furthermore, condition (ii) implies (iii). Indeed, let C and D be paths
from P to Q in U. Let D = (D_1, ..., D_k) where each D_j is a C^1-curve.
Then we may form the opposite path

    D^- = (D_k^-, \ldots, D_1^-),

and by the lemma,

    \int_{D^-} F = -\int_D F.

If C = (C_1, ..., C_m), then the path (C_1, ..., C_m, D_k^-, ..., D_1^-) is a
closed path from P to P (Fig. 6), and assuming (ii), we conclude that the
integral of F along this closed path is equal to 0. Thus

    \int_C F + \int_{D^-} F = 0.

From this it follows that

    \int_C F = \int_D F,

whence condition (iii) holds.



We shall now prove that condition (iii) implies (i). Let P_0 be a fixed
point of U and define a function φ on U by the rule

    \varphi(P) = \int_{P_0}^{P} F,

where the integral is taken along any path from P_0 to P. By assumption,
this integral does not depend on the path, so we don’t need to specify
the path in the notation. We must show that the partial derivatives
D_i φ(P) exist for all P in U, and if the vector field F has coordinate
functions

    F = (f_1, \ldots, f_n),

then D_i φ(P) = f_i(P).

To do this, let E_i be the unit vector with 1 in the i-th component and 0
in the other components. Then for any vector X = (x_1, ..., x_n) we
have X · E_i = x_i. To determine D_i φ(P) we must consider the Newton
quotient

    \frac{\varphi(P + hE_i) - \varphi(P)}{h}

and show that its limit as h → 0 is f_i(P). The integral from P_0 to P + hE_i
can be taken along a path going first from P_0 to P and then from P to
P + hE_i.



We can then cancel the integrals from P_0 to P and obtain

    \frac{\varphi(P + hE_i) - \varphi(P)}{h} = \frac{1}{h} \int_P^{P + hE_i} F(C) \cdot dC,

taking the integral along any curve C between P and P + hE_i. In fact,
we take C to be the parametrized straight line segment given by

    C(t) = P + tE_i

with 0 ≤ t ≤ h in case h is positive. (The case of h negative is handled
similarly. Cf. Exercise 1.) Then C'(t) = E_i and

    F(C(t)) \cdot C'(t) = f_i(C(t)).

The Newton quotient is therefore equal to

    \frac{1}{h} \int_0^h f_i(C(t)) \, dt.

By the fundamental theorem of calculus, for any continuous function g
we have (cf. Remark after the proof):

    \lim_{h \to 0} \frac{1}{h} \int_0^h g(t) \, dt = g(0).

We apply this to the function given by g(t) = f_i(C(t)). Then

    g(0) = f_i(C(0)) = f_i(P + 0E_i) = f_i(P).

Therefore we obtain the limit

    \lim_{h \to 0} \frac{\varphi(P + hE_i) - \varphi(P)}{h} = f_i(P).

This proves what we wanted.

Remark. The use of the fundamental theorem of calculus in the
preceding proof should be recognized as absolutely straightforward. If G
is an indefinite integral for g, then

    \int_0^h g(t) \, dt = G(h) - G(0),

and hence

    \frac{1}{h} \int_0^h g(t) \, dt = \frac{G(h) - G(0)}{h}

is the ordinary Newton quotient for G. The fundamental theorem of
calculus asserts precisely that the limit as h → 0 is equal to G'(0) = g(0).


Exercise 

1. To take care of the case when h is negative in the proof of Theorem 6, use
the parametrization C(t) = P + thE_i with 0 ≤ t ≤ 1. Making a change
of variables, u = th, du = h dt, show that the proof follows exactly the
same pattern as that given in the text.




CHAPTER VI 


Higher Derivatives 


In this chapter, we discuss two things which are of independent interest. 
First, we define partial differential operators (with constant coefficients). 
It is very useful to have facility in working with these formally. 

Secondly, we apply them to the derivation of Taylor’s formula for 
functions of several variables, which will be very similar to the formula 
for one variable. The formula, as before, tells us how to approximate a 
function by means of polynomials. In the present theory, these poly¬ 
nomials involve several variables, of course. We shall see that they are 
hardly more difficult to handle than polynomials in one variable in the 
matters under consideration. 

The proof that the partial derivatives commute is tricky. It can be 
omitted without harm in a class allergic to theory, because the technique 
involved never reappears in the rest of this book. 

§1. Repeated partial derivatives 

Let/be a function of two variables, defined on an open set U in 2-space. 
Assume that its first partial derivative exists. Then D±f (which we also 
write df/dx if x is the first variable) is a function defined on U. We may 
then ask for its first or second partial derivatives, i.e. we may form D 2 D x f 
or D\D\f if these exist. Similarly, if D 2 f exists, and if the first partial 
derivative of D 2 f exists, we may form D x D 2 f. 

Suppose that we write / in terms of the two variables (x, y ). Then we 


can write 


DM.y) - £(f 

£) = (£.(£ 2 /))(x, y), 

and 



;) = (D 2 (DJ))(x, y). 


For example, let $f(x, y) = \sin(xy)$. Then

$$\frac{\partial f}{\partial x} = y\cos(xy) \quad\text{and}\quad \frac{\partial f}{\partial y} = x\cos(xy).$$

Hence

$$D_2 D_1 f(x, y) = -xy\sin(xy) + \cos(xy).$$

But differentiating $\partial f/\partial y$ with respect to $x$, we see that

$$D_1 D_2 f(x, y) = -xy\sin(xy) + \cos(xy).$$

These two repeated partial derivatives are equal! 
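
The equality just observed can also be checked numerically. The following is only a sketch: the helper `partial`, the step size, and the sample point are our own choices for illustration, and central differences give approximations rather than exact derivatives.

```python
import math

def f(x, y):
    return math.sin(x * y)

def partial(g, index, step=1e-4):
    """Return a central-difference approximation to D_index g (index is 1 or 2)."""
    def derivative(x, y):
        if index == 1:
            return (g(x + step, y) - g(x - step, y)) / (2 * step)
        return (g(x, y + step) - g(x, y - step)) / (2 * step)
    return derivative

def closed_form(x, y):
    # The common value found in the text: -xy sin(xy) + cos(xy).
    return -x * y * math.sin(x * y) + math.cos(x * y)

x0, y0 = 0.7, 1.3                            # arbitrary sample point
d2_d1 = partial(partial(f, 1), 2)(x0, y0)    # approximates D_2 D_1 f
d1_d2 = partial(partial(f, 2), 1)(x0, y0)    # approximates D_1 D_2 f
print(d2_d1, d1_d2, closed_form(x0, y0))
```

Both nested difference quotients use the same four sample values of $f$, so their agreement with each other is automatic; the informative comparison is with the closed form.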

The next theorem tells us that in practice, this will always happen. 

Theorem 1. Let $f$ be a function of two variables, defined on an open
set $U$ of 2-space. Assume that the partial derivatives $D_1 f$, $D_2 f$, $D_1 D_2 f$,
and $D_2 D_1 f$ exist and are continuous. Then

$$D_1 D_2 f = D_2 D_1 f.$$

Proof. A direct use of the definition of these partial and repeated 
partial derivatives would lead to a blind alley. Hence we shall have to 
use a special trick to pull through. 

Let $(x, y)$ be a point in $U$, and let $H = (h, k)$ be small, $h \ne 0$, $k \ne 0$.
We consider the expression

$$g(x) = f(x, y + k) - f(x, y).$$

If we apply the mean value theorem to $g$, then we conclude that there
exists a number $s_1$ between $x$ and $x + h$ such that

$$g(x + h) - g(x) = g'(s_1)h,$$

or in other words, using the definitions of partial derivative:

(1)   $$g(x + h) - g(x) = [D_1 f(s_1, y + k) - D_1 f(s_1, y)]h.$$

But the difference on the left of this equation is

(2)   $$f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y).$$

On the other hand, we can now apply the mean value theorem to the 
expression in brackets in (1) with respect to the second variable. If we do 
this, we see that the long expression in (2) is equal to 

(3)   $$D_2 D_1 f(s_1, s_2)kh$$

for some number $s_2$ lying between $y$ and $y + k$.

We now start all over again, and consider the expression

$$g_2(y) = f(x + h, y) - f(x, y).$$

We apply the mean value theorem to $g_2$, and conclude that there is a
number $t_2$ between $y$ and $y + k$ such that

$$g_2(y + k) - g_2(y) = g_2'(t_2)k,$$

or in other words, is equal to

(4)   $$[D_2 f(x + h, t_2) - D_2 f(x, t_2)]k.$$

If you work out $g_2(y + k) - g_2(y)$, you will see that it is equal to the
long expression of (2). Furthermore, proceeding as before, and applying
the mean value theorem to the first variable in (4), we see that (4) becomes

(5)   $$D_1 D_2 f(t_1, t_2)hk$$

for some number $t_1$ between $x$ and $x + h$. Since (5) and (3) are both
equal to the long expression in (2), they are equal to each other. Thus
finally we obtain

$$D_2 D_1 f(s_1, s_2)kh = D_1 D_2 f(t_1, t_2)hk.$$

Since we assume from the beginning that $h \ne 0$ and $k \ne 0$, we can cancel
$hk$, and get

$$D_2 D_1 f(s_1, s_2) = D_1 D_2 f(t_1, t_2).$$

Now as $h$, $k$ approach 0, the left side of this equation approaches
$D_2 D_1 f(x, y)$ because $D_2 D_1 f$ is assumed to be continuous. Similarly,
the right-hand side approaches $D_1 D_2 f(x, y)$. We can therefore conclude
that

$$D_1 D_2 f(x, y) = D_2 D_1 f(x, y),$$

as desired. 
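
The proof shows that the long expression (2), divided by $hk$, is squeezed between values of the two mixed partials near $(x, y)$. A minimal numerical sketch of this, assuming the sample function $f(x, y) = x^3 y^2$ (our choice, not taken from the text), for which $D_1 D_2 f(x, y) = 6x^2 y$:

```python
def f(x, y):
    return x**3 * y**2          # sample function; D_1 D_2 f(x, y) = 6 x^2 y

def second_difference(x, y, h, k):
    """The long expression (2) from the proof of Theorem 1."""
    return f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y)

x0, y0 = 1.0, 2.0
exact_mixed_partial = 6 * x0**2 * y0     # equals 12 at the point (1, 2)

quotients = []
for step in (1e-1, 1e-2, 1e-3, 1e-4):
    # Take h = k = step and divide the second difference by hk.
    quotients.append(second_difference(x0, y0, step, step) / (step * step))
print(quotients, exact_mixed_partial)
```

As the step shrinks, the quotients approach the common value of $D_1 D_2 f$ and $D_2 D_1 f$ at the point.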

Consider now a function of three variables $f(x, y, z)$. We can then
take three kinds of partial derivatives: $D_1$, $D_2$, or $D_3$ (in other notation,
$\partial/\partial x$, $\partial/\partial y$, and $\partial/\partial z$). Let us assume throughout that all the partial
derivatives which we shall consider exist and are continuous, so that we 
may form as many repeated partial derivatives as we please. Then using 
Theorem 1, we can show that it does not matter in which order we take 
these partials. 

For instance, we see that

$$D_3 D_1 f = D_1 D_3 f.$$

This is simply an application of Theorem 1, keeping the second variable
fixed. We may take a further partial derivative, for instance

$$D_1 D_3 D_1 f.$$

Here $D_1$ occurs twice and $D_3$ once. Then this expression will be equal to
any other repeated partial derivative of $f$ in which $D_1$ occurs twice and
$D_3$ once. For example, we apply the theorem to the function $(D_1 f)$.
Then the theorem allows us to interchange $D_1$ and $D_3$ in front of $(D_1 f)$
(always assuming that all partials we want to take exist and are
continuous). We obtain

$$D_1 D_3 (D_1 f) = D_3 D_1 (D_1 f).$$
As another example, consider

(6)   $$D_2 D_1 D_3 D_2 f.$$

We wish to show that it is equal to $D_1 D_2 D_2 D_3 f$. By Theorem 1, we have
$D_3 D_2 f = D_2 D_3 f$. Hence:

(7)   $$D_2 D_1 (D_3 D_2 f) = D_2 D_1 (D_2 D_3 f).$$

We then apply Theorem 1 again, and interchange $D_2$ and $D_1$ to obtain
the desired expression.

In general, suppose that we are given three positive integers $m_1$, $m_2$,
and $m_3$. We wish to take the repeated partial derivatives of $f$ by using
$m_1$ times the first partial $D_1$, using $m_2$ times the second partial $D_2$, and
using $m_3$ times the third partial $D_3$. Then it does not matter in which
order we take these partial derivatives: we shall always get the same
answer.

To see this, note that by repeated application of Theorem 1, we can
always interchange any occurrence of $D_3$ with $D_2$ or $D_1$ so as to push
$D_3$ towards the right. We can perform such interchanges until all
occurrences of $D_3$ occur furthest to the right, in the same way as we pushed
$D_3$ towards the right going from expression (6) to expression (7). Once
this is done, we start interchanging $D_2$ with $D_1$ until all occurrences of
$D_2$ pile up just behind $D_3$. Once this is done, we are left with $D_1$ repeated
a certain number of times on the left.

No matter with what arrangement of $D_1$, $D_2$, $D_3$ we started, we end up
with the same arrangement, namely

$$\underbrace{D_1 \cdots D_1}_{m_1}\ \underbrace{D_2 \cdots D_2}_{m_2}\ \underbrace{D_3 \cdots D_3}_{m_3} f,$$

with $D_1$ occurring $m_1$ times, $D_2$ occurring $m_2$ times, and $D_3$ occurring
$m_3$ times.

Exactly the same argument works for functions of more variables. 


Exercises 

In all problems, functions are assumed to be differentiable as needed. 

Find the partial derivatives of order 2 for the following functions and verify
explicitly in each case that $D_1 D_2 f = D_2 D_1 f$.

1. $e^{xy}$    2. $\sin(xy)$
3. $x^2 y^3 + 3xy$    4. $2xy + y^2$
5. $e^{x^2 + y^2}$    6. $\sin(x^2 + y)$
7. $\cos(x^3 + xy)$    8. $\arctan(x^2 - 2xy)$
9. $e^{x+y}$    10. $\sin(x + y)$

Find $D_1 D_2 D_3 f$ and $D_3 D_2 D_1 f$ in the following cases.

11. $xyz$    12. $x^2 yz$
13. $e^{xyz}$    14. $\sin(xyz)$
15. $\cos(x + y + z)$    16. $\sin(x + y + z)$
17. $(x^2 + y^2 + z^2)^{-1}$    18. $x^3 y^2 z + 2(x + y + z)$

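19. Let $x = r\cos\theta$ and $y = r\sin\theta$. Let $f(x, y) = g(r, \theta)$. Show that

$$\frac{\partial f}{\partial x} = \cos\theta\,\frac{\partial g}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial g}{\partial \theta},$$

$$\frac{\partial f}{\partial y} = \sin\theta\,\frac{\partial g}{\partial r} + \frac{\cos\theta}{r}\,\frac{\partial g}{\partial \theta}.$$

[Hint: Using the chain rule, find first $\partial g/\partial r$ and $\partial g/\partial\theta$ in terms of $\partial f/\partial x$ and $\partial f/\partial y$. Then
solve back a system of two equations in two unknowns.]

20. Let $x = r\cos\theta$ and $y = r\sin\theta$. Let $f(x, y) = g(r, \theta)$. Show that

$$\frac{\partial^2 g}{\partial r^2} + \frac{1}{r}\frac{\partial g}{\partial r} + \frac{1}{r^2}\frac{\partial^2 g}{\partial \theta^2} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}.$$

21. Let $f(X) = g(r)$ (with $r = \|X\|$), and assume $X = (x, y, z)$. Show that

$$\frac{d^2 g}{dr^2} + \frac{2}{r}\frac{dg}{dr} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$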

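22. Let $f(x, y)$ satisfy $f(tx, ty) = t^n f(x, y)$ for all $t$ ($n$ being some integer $\ge 1$).
Show that

$$x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = nf(x, y).$$

23. Let $f$ be as in Exercise 22. Show that

$$x^2\frac{\partial^2 f}{\partial x^2} + 2xy\frac{\partial^2 f}{\partial x\,\partial y} + y^2\frac{\partial^2 f}{\partial y^2} = n(n - 1)f(x, y).$$

(It is understood throughout that all functions are as many times
differentiable as is necessary.)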

24. A function of three variables $f(x, y, z)$ is said to satisfy Laplace's equation if

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0.$$

Verify that the following functions satisfy Laplace's equation.

(a) $x^2 + y^2 - 2z^2$    (b) $\log\sqrt{x^2 + y^2}$

(c) $\dfrac{1}{\sqrt{x^2 + y^2 + z^2}}$    (d) $e^{3x+4y}\cos(5z)$

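25. Let $z = f(u, v)$ and $u = x + y$, $v = x - y$. Show that

$$\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial u^2} - \frac{\partial^2 z}{\partial v^2}.$$

26. Let $z = f(x + y) - g(x - y)$, where $f$, $g$ are functions of one variable.
Let $u = x + y$ and $v = x - y$. Show that

$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 z}{\partial y^2} = f''(u) - g''(v).$$

27. Let $c$ be a constant, and let $z = \sin(x + ct) + \cos(2x + 2ct)$. Show that

$$\frac{\partial^2 z}{\partial t^2} = c^2\frac{\partial^2 z}{\partial x^2}.$$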

28. Let z = • Show that $x(\partial z/\partial x) + y(\partial z/\partial y) = 0$.

29. Let c be a constant, and let z = f(x + ct) + g(x — ct). Let u = x + ct 
and v = x — ct. Show that 


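30. Let $F$ be a vector field on an open set in 3-space, so that $F$ is given by three
coordinate functions, say $F = (f_1, f_2, f_3)$. Define the curl of $F$ to be the
vector field given by

$$(\operatorname{curl} F)(x_1, x_2, x_3) = \left(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3},\ \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1},\ \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right).$$

Define the divergence of $F$ to be the function $g = \operatorname{div} F$ given by

$$g(x, y, z) = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}.$$

(a) Prove that

$$\operatorname{div} \operatorname{curl} F = 0.$$

(b) Prove that $\operatorname{curl} \operatorname{grad} \varphi = O$, for any function $\varphi$.
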
§2. Partial differential operators


We shall continue the discussion at the end of the last section, but we 
shall build up a convenient system to talk about iterated partial 
derivatives. 

For simplicity, let us begin with functions of one variable $x$. We can
then take only one type of derivative,

$$D = \frac{d}{dx}.$$

Let $f$ be a function of one variable, and let us assume that all the iterated
derivatives of $f$ exist. Let $m$ be a positive integer. Then we can take the
$m$-th derivative of $f$, which we once denoted by $f^{(m)}$. We now write it

$$DD \cdots Df \quad\text{or}\quad \frac{d}{dx}\,\frac{d}{dx} \cdots \frac{d}{dx}\,f,$$

the derivative $D$ (or $d/dx$) being iterated $m$ times. What matters here is
the number of times $D$ occurs. We shall use the notation $D^m$ or $(d/dx)^m$
to mean the iteration of $D$, $m$ times. Thus we write

$$D^m f \quad\text{or}\quad \left(\frac{d}{dx}\right)^m f$$

instead of the above expressions. This is shorter. But even better, we
have the rule

$$D^m D^n f = D^{m+n} f$$

for any positive integers $m$, $n$. So this iteration of derivatives begins to
look like a multiplication. Furthermore, if we define $D^0 f$ to be simply $f$,
then the rule above also holds if $m$, $n$ are $\ge 0$.

The expression $D^m$ will be called a simple differential operator of order $m$
(in one variable, so far).

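Let us now look at the case of two variables, say $(x, y)$. We can then
take two partials $D_1$ and $D_2$ (or $\partial/\partial x$ and $\partial/\partial y$). Let $m_1$, $m_2$ be two
integers $\ge 0$. Instead of writing

$$\underbrace{D_1 \cdots D_1}_{m_1}\ \underbrace{D_2 \cdots D_2}_{m_2} f \quad\text{or}\quad \underbrace{\frac{\partial}{\partial x} \cdots \frac{\partial}{\partial x}}_{m_1}\ \underbrace{\frac{\partial}{\partial y} \cdots \frac{\partial}{\partial y}}_{m_2} f$$

we shall write

$$D_1^{m_1} D_2^{m_2} f \quad\text{or}\quad \left(\frac{\partial}{\partial x}\right)^{m_1}\left(\frac{\partial}{\partial y}\right)^{m_2} f.$$

For instance, taking $m_1 = 2$ and $m_2 = 5$ we would write

$$D_1^2 D_2^5 f.$$

This means: take the first partial twice and the second partial five times
(in any order). (We assume throughout that all repeated partials exist
and are continuous.)

An expression of type

$$D_1^{m_1} D_2^{m_2}$$

will be called a simple differential operator, and we shall say that its order
is $m_1 + m_2$. In the example we just gave, the order is $5 + 2 = 7$.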

It is now clear how to proceed with three or more variables, and it is
no harder to express our thoughts in terms of $n$ variables than in terms of
three. Consequently, if we deal with functions of $n$ variables, all of whose
repeated partial derivatives exist and are continuous in some open set $U$,
and if $D_1, \ldots, D_n$ denote the partial derivatives with respect to these
variables, then we call an expression

$$D_1^{m_1} \cdots D_n^{m_n} \quad\text{or}\quad \left(\frac{\partial}{\partial x_1}\right)^{m_1} \cdots \left(\frac{\partial}{\partial x_n}\right)^{m_n}$$

a simple differential operator, $m_1, \ldots, m_n$ being integers $\ge 0$. We say that
its order is $m_1 + \cdots + m_n$.

Given a function $f$ (satisfying the above stated conditions), and a simple
differential operator $D$, we write $Df$ to mean the function obtained from
$f$ by applying repeatedly the partial derivatives $D_1, \ldots, D_n$, the number
of times being the number of times each $D_i$ occurs in $D$.


Example 1. Consider functions of three variables $(x, y, z)$. Then

$$D = D_1^3 D_2^5 D_3^2$$

is a simple differential operator of order $3 + 5 + 2 = 10$. Let $f$ be a
function of three variables satisfying the usual hypotheses. To take $Df$
means that we take the partial derivative with respect to $z$ twice, the
partial with respect to $y$ five times, and the partial with respect to $x$ three
times.

We observe that a simple differential operator gives us a rule which to
each function $f$ associates another function $Df$.

As a matter of notation, referring to Example 1, one would also write
the differential operator $D$ in the form

$$D = \frac{\partial^{10}}{\partial x^3\,\partial y^5\,\partial z^2}.$$
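In this notation, one would thus have

$$\left(\frac{\partial}{\partial x}\right)^2 f = \frac{\partial^2 f}{\partial x^2}$$

and

$$\left(\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial y}\right) f = \frac{\partial^2 f}{\partial x\,\partial y}.$$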

All the above notations are used in the scientific literature, and this is the 
reason for including them here. 

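Warning. Do not confuse the two expressions

$$\left(\frac{\partial}{\partial x}\right)^2 f = \frac{\partial^2 f}{\partial x^2} \quad\text{and}\quad \left(\frac{\partial f}{\partial x}\right)^2,$$

which are usually not equal. For instance, if $f(x, y) = x^2 y$, then

$$\frac{\partial^2 f}{\partial x^2} = 2y \quad\text{and}\quad \left(\frac{\partial f}{\partial x}\right)^2 = 4x^2 y^2.$$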

We shall now show how one can add simple differential operators and 
multiply them by constants. 

Let $D$, $D'$ be two simple differential operators. For any function $f$ we
define $(D + D')f$ to be $Df + D'f$. If $c$ is a number, then we define $(cD)f$
to be $c(Df)$. In this manner, taking iterated sums, and products with
constants, we obtain what we shall call differential operators. Thus a
differential operator $D$ is a sum of terms of type

$$cD_1^{m_1} \cdots D_n^{m_n},$$

where $c$ is a number and $m_1, \ldots, m_n$ are integers $\ge 0$.

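Example 2. Dealing with two variables, we see that

$$D = 3\frac{\partial}{\partial x} + 5\left(\frac{\partial}{\partial x}\right)^2 - \pi\frac{\partial}{\partial x}\frac{\partial}{\partial y}$$

is a differential operator. Let $f(x, y) = \sin(xy)$. We wish to find $Df$.
By definition,

$$Df(x, y) = 3\frac{\partial f}{\partial x} + 5\left(\frac{\partial}{\partial x}\right)^2 f - \pi\frac{\partial}{\partial x}\frac{\partial}{\partial y} f$$

$$= 3y\cos(xy) + 5(-y^2\sin(xy)) - \pi[y(-\sin(xy))x + \cos(xy)].$$

We see that a differential operator associates with each function $f$
(satisfying the usual conditions) another function $Df$.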
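
The value of $Df$ in Example 2 can be compared against finite differences. This is a sketch under our own assumptions: the function name `apply_D`, the step size, and the sample point are illustrative, and the difference quotients only approximate the partial derivatives.

```python
import math

def f(x, y):
    return math.sin(x * y)

def apply_D(f, x, y, step=1e-4):
    """Approximate (3 D_1 + 5 D_1^2 - pi D_1 D_2) f at (x, y) by central differences."""
    d1 = (f(x + step, y) - f(x - step, y)) / (2 * step)
    d1_d1 = (f(x + step, y) - 2 * f(x, y) + f(x - step, y)) / step**2
    d1_d2 = (f(x + step, y + step) - f(x + step, y - step)
             - f(x - step, y + step) + f(x - step, y - step)) / (4 * step**2)
    return 3 * d1 + 5 * d1_d1 - math.pi * d1_d2

def closed_form(x, y):
    # The expression obtained in Example 2.
    return (3 * y * math.cos(x * y)
            + 5 * (-y**2 * math.sin(x * y))
            - math.pi * (y * (-math.sin(x * y)) * x + math.cos(x * y)))

x0, y0 = 0.4, 1.1                      # arbitrary sample point
print(apply_D(f, x0, y0), closed_form(x0, y0))
```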



Let $c$ be a number and $f$ a function. Let $D_i$ be any partial derivative.
Then

$$D_i(cf) = cD_i f.$$

This is simply the old property that the derivative of a constant times a
function is equal to the constant times the derivative of the function.
Iterating partial derivatives, we see that this same property applies to
differential operators. For any differential operator $D$, and any number $c$,
we have

$$D(cf) = cDf.$$


Furthermore, if $f$, $g$ are two functions (defined on the same open set,
and having continuous partial derivatives of all orders), then for any
partial derivative $D_i$, we have

$$D_i(f + g) = D_i f + D_i g.$$

Iterating the partial derivatives, we find that for any differential operator
$D$, we have

$$D(f + g) = Df + Dg.$$

Having learned how to add differential operators, we now learn how to 
multiply them. 

Let $D$, $D'$ be two differential operators. Then we define the differential
operator $DD'$ to be the one obtained by taking first $D'$ and then $D$. In
other words, if $f$ is a function, then

$$(DD')f = D(D'f).$$

Example 3. Let 



Differential operators multiply just like polynomials and numbers, and 
their addition and multiplication satisfy all the rules of addition and 
multiplication of polynomials. For instance: 

If D, D' are two differential operators, then 

DD' = D'D. 

If D, D', D" are three differential operators, then 

D(D' + D") = DD' + DD". 
It would be tedious to list all the properties here and to give in detail
all the proofs (even though they are quite simple). We shall therefore 
omit these proofs. The main purpose of this section is to insure that you 
develop as great a facility in adding and multiplying differential operators 
as you have in adding and multiplying numbers or polynomials. 

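When a differential operator is written as a sum of terms of type

$$cD_1^{m_1} \cdots D_n^{m_n}$$

then we shall say that it is in standard form.

For example,

$$3\left(\frac{\partial}{\partial x}\right)^2 + 14\frac{\partial}{\partial x}\frac{\partial}{\partial y} + 8\left(\frac{\partial}{\partial y}\right)^2$$

is in standard form, but

$$\left(3\frac{\partial}{\partial x} + 2\frac{\partial}{\partial y}\right)\left(\frac{\partial}{\partial x} + 4\frac{\partial}{\partial y}\right)$$

is not.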
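
Since differential operators with constant coefficients multiply like polynomials, putting a product in standard form is a purely mechanical computation. The sketch below is one possible encoding and not part of the text: an operator is stored as a dictionary sending each exponent tuple $(m_1, m_2)$ to its coefficient, and the name `multiply` is our own.

```python
from collections import defaultdict

def multiply(op_a, op_b):
    """Multiply two constant-coefficient operators given as {exponents: coefficient}."""
    product = defaultdict(int)
    for exps_a, coeff_a in op_a.items():
        for exps_b, coeff_b in op_b.items():
            # D_1^a D_2^b times D_1^c D_2^d is D_1^(a+c) D_2^(b+d).
            exps = tuple(a + b for a, b in zip(exps_a, exps_b))
            product[exps] += coeff_a * coeff_b
    return {e: c for e, c in product.items() if c != 0}

first = {(1, 0): 3, (0, 1): 2}     # 3 D_1 + 2 D_2
second = {(1, 0): 1, (0, 1): 4}    # D_1 + 4 D_2
standard_form = multiply(first, second)
print(standard_form)               # coefficients of D_1^2, D_1 D_2, D_2^2
```

The result encodes $3D_1^2 + 14D_1D_2 + 8D_2^2$, the standard form displayed above, and multiplying in the other order gives the same dictionary.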

Each term

$$cD_1^{m_1} \cdots D_n^{m_n}$$

is said to have degree $m_1 + \cdots + m_n$. If a differential operator is
expressed as a sum of simple differential operators which all have the
same degree, say $m$, then we say that it is homogeneous of degree $m$.

The differential operator of Example 2 is not homogeneous. The
differential operator $DD'$ of Example 3 is homogeneous of degree 2.

An important case of differential operators being applied to functions 
is that of monomials. 

Example. Let $f(x, y) = x^3 y^2$. Then

$$D_1 f(x, y) = 3x^2 y^2, \qquad D_1^2 f(x, y) = 2 \cdot 3\,xy^2,$$

$$D_1^3 f(x, y) = 6y^2, \qquad D_1^4 f(x, y) = 0.$$

Also observe that

$$D_1^3 D_2^2 f(x, y) = 3!\,2!.$$

Example. The generalization of the above example is as follows, and
will be important for Taylor's formula. Let

$$f(x, y) = x^i y^j$$

be a monomial, with exponents $i, j \ge 0$. Then

$$D_1^i D_2^j f(x, y) = i!\,j!.$$

This is immediately verified, by differentiating $x^i$ with respect to $x$, $i$
times, thus getting rid of all powers of $x$; and differentiating $y^j$ with respect
to $y$, $j$ times, thus getting rid of all powers of $y$.

On the other hand, let $r$, $s$ be integers $\ge 0$ such that $i \ne r$ or $j \ne s$.
Then

$$D_1^r D_2^s f(0, 0) = 0.$$

To see this, suppose that $r \ne i$. If $r > i$, then differentiating $r$ times the
power $x^i$ yields 0. If $r < i$, then differentiating $r$ times the power $x^i$ yields

$$i(i - 1) \cdots (i - r + 1)x^{i-r},$$

and $i - r > 0$. Substituting $x = 0$ yields 0. The same argument works
if $j \ne s$.
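
The rule for monomials can be tested with exact integer arithmetic. In the following sketch (our own encoding, not from the text) a polynomial in $x$, $y$ is a dictionary sending $(i, j)$ to the coefficient of $x^i y^j$; the names `falling`, `differentiate`, and `value_at_origin` are hypothetical helpers.

```python
def falling(n, r):
    """n (n-1) ... (n-r+1): the factor produced by differentiating x^n r times."""
    result = 1
    for t in range(r):
        result *= n - t        # becomes 0 as soon as r > n
    return result

def differentiate(poly, r, s):
    """Apply D_1^r D_2^s to a polynomial stored as {(i, j): coefficient}."""
    out = {}
    for (i, j), c in poly.items():
        coeff = c * falling(i, r) * falling(j, s)
        if coeff != 0:
            key = (i - r, j - s)
            out[key] = out.get(key, 0) + coeff
    return out

def value_at_origin(poly, r, s):
    # Only the constant term of the differentiated polynomial survives at (0, 0).
    return differentiate(poly, r, s).get((0, 0), 0)

monomial = {(3, 2): 1}                      # x^3 y^2
print(value_at_origin(monomial, 3, 2))      # r = i and s = j
print(value_at_origin(monomial, 2, 2))      # r < i
print(value_at_origin(monomial, 4, 2))      # r > i
```

Only the first call gives a non-zero value, namely $3!\,2! = 12$.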


Exercises 


Put the following differential operators in standard form.

1. $(3D_1 + 2D_2)^2$    2. $(D_1 + D_2 + D_3)^2$
3. $(D_1 - D_2)(D_1 + D_2)$    4. $(D_1 + D_2)^2$
5. $(D_1 + D_2)^3$    6. $(D_1 + D_2)^4$
7. $(2D_1 - 3D_2)(D_1 + D_2)$    8. $(D_1 - D_3)(D_2 + 5D_3)$

Find the values of the differential operator of Exercise 10 applied to the
following functions at the given point.

13. $x^2 y$ at $(0, 1)$    14. $xy$ at $(1, 1)$

15. $\sin(xy)$ at $(0, \pi)$

17. Compute D\D\f(x, y) if f(x, j) is 
(a) x 5 y 4 

(c) x 4 y 3 

18. Compute D\dI/(0, 0) if f(x, y) is 
(a) x 8 y 7 

(c) 11 x 7 y 9 


16. $e^{xy}$ at $(0, 0)$


(b) x 4 y 2 
(d) 10 x 4 t 3 


(b) 3 x 7 y 9 
(d) 25 x 6 y n 


19. Let f(x,y) — 3 x 2 y + 4 x s y 4 — 7 x 9 y 4 . Find 

(a) D 3 D 4 /(0, 0) (b) DjD|/(0, 0) 

(c) D\ D 2 /(0, 0) (d) D 3 D 2 /(0, 0) 

20. Let/(x, y, z) — 4 x 2 yz' 6 — 5x 3 t 4 2 + 7x 6 t 10 z 7 . Find 

(a) D\ D 2 D 2 /(0, 0, 0) (b) D 2 D 2 D 3 /(0, 0, 0) 

(c) D\D l 2 °D 7 f(0, 0, 0) (d) DfD 2 D 3 /(0,0,0) 
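21. Let $f$, $g$ be two functions (of two variables) with continuous partial
derivatives of order $\le 2$ in an open set $U$. Assume that

$$\frac{\partial f}{\partial x} = -\frac{\partial g}{\partial y} \quad\text{and}\quad \frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}.$$

Show that

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0.$$

22. Let $f$ be a function of three variables, defined for $X \ne O$ by $f(X) = 1/\|X\|$.
Show that

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0.$$

23. In Exercise 20 of the preceding section, compute



in terms of $\partial/\partial r$ and $\partial/\partial\theta$. Watch out! The coefficients are not constant.
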

§3. Taylor’s formula 

In the theory of functions of one variable, we derived an expression
for the values of a function $f$ near a point $a$ by means of the derivatives,
namely

$$f(a + h) = f(a) + f'(a)h + \frac{f''(a)}{2!}h^2 + \cdots + \frac{f^{(r-1)}(a)}{(r - 1)!}h^{r-1} + R_r,$$

where

$$R_r = \frac{f^{(r)}(c)}{r!}h^r$$

for some point $c$ between $a$ and $a + h$. We shall now derive a similar
formula for functions of several variables. We begin with the case of two
variables.

We let $P = (a, b)$ and $H = (h, k)$. We assume that $P$ is in an open
set $U$ and that $f$ is a function on $U$ all of whose partial derivatives up to
order $n$ exist and are continuous. We are interested in finding an expression

$$f(P + H) = f(P) + \ ???$$

The idea is to reduce the problem to the one variable case. Thus we define
the function

$$g(t) = f(P + tH) = f(a + th, b + tk)$$

for $0 \le t \le 1$. We assume that $U$ contains all points $P + tH$ for
$0 \le t \le 1$. Then

$$g(1) = f(P + H) \quad\text{and}\quad g(0) = f(P).$$


We can use Taylor's formula in one variable applied to the function $g$
and we know that

$$g(1) = g(0) + \frac{g'(0)}{1!} + \cdots + \frac{g^{(r-1)}(0)}{(r - 1)!} + \frac{g^{(r)}(\tau)}{r!}$$

for some number $\tau$ between 0 and 1, provided that $g$ has $r$ continuous
derivatives. We shall now prove that the derivatives of $g$ can be expressed
in terms of the partial derivatives of $f$, and thus obtain the desired Taylor
formula for $f$. We shall first do it for $n = 2$.

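We let $x = a + th$ and $y = b + tk$. By the chain rule:

(1)   $$g'(t) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = \frac{\partial f}{\partial x}h + \frac{\partial f}{\partial y}k.$$

For the second derivative, we must find the derivative with respect to $t$
of each one of the functions $\partial f/\partial x$ and $\partial f/\partial y$. By the chain rule applied
to each such function, we have:

(2)   $$\frac{d}{dt}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}\frac{dx}{dt} + \frac{\partial^2 f}{\partial y\,\partial x}\frac{dy}{dt} = \frac{\partial^2 f}{\partial x^2}h + \frac{\partial^2 f}{\partial y\,\partial x}k,$$

$$\frac{d}{dt}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y}h + \frac{\partial^2 f}{\partial y^2}k.$$

Hence using (2) to take the derivative of (1), we find:

$$g''(t) = h\left[\frac{\partial^2 f}{\partial x^2}h + \frac{\partial^2 f}{\partial y\,\partial x}k\right] + k\left[\frac{\partial^2 f}{\partial x\,\partial y}h + \frac{\partial^2 f}{\partial y^2}k\right]$$

$$= h^2\frac{\partial^2 f}{\partial x^2} + 2hk\frac{\partial^2 f}{\partial x\,\partial y} + k^2\frac{\partial^2 f}{\partial y^2}.$$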

This expression can be rewritten more easily in terms of differential
operators, namely we see that the expression for $g''(t)$ is equal to

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^2 f.$$

If we wish to free the notation from the $x$ and $y$, then we can use the
notation

$$g''(t) = (hD_1 + kD_2)^2 f(P + tH) = (hD_1 + kD_2)^2 f(a + th, b + tk).$$

As usual, this means that we apply $(hD_1 + kD_2)^2$ to $f$, and then evaluate
this function at the point $(a + th, b + tk)$.
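
The formula for $g''(t)$ lends itself to a quick numerical check. The sketch below assumes a sample function $f(x, y) = e^x \sin y$ with hand-computed second partials, and arbitrary choices of $P$, $H$, and $t$; none of these are taken from the text.

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(y)

# Second partials of the sample f, computed by hand.
def f_xx(x, y): return math.exp(x) * math.sin(y)
def f_xy(x, y): return math.exp(x) * math.cos(y)
def f_yy(x, y): return -math.exp(x) * math.sin(y)

a, b = 0.5, 0.2        # the point P
h, k = 0.3, -0.4       # the vector H

def g(t):
    return f(a + t * h, b + t * k)

def g_second_numeric(t, step=1e-4):
    """Central second difference approximating g''(t)."""
    return (g(t + step) - 2 * g(t) + g(t - step)) / step**2

def g_second_formula(t):
    """(h D_1 + k D_2)^2 f evaluated at P + tH."""
    x, y = a + t * h, b + t * k
    return h**2 * f_xx(x, y) + 2 * h * k * f_xy(x, y) + k**2 * f_yy(x, y)

t0 = 0.6
print(g_second_numeric(t0), g_second_formula(t0))
```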
The expression $hD_1 + kD_2$ looks like a dot product, and thus it is
useful to abbreviate the notation and write

$$hD_1 + kD_2 = H \cdot \nabla.$$

With this abbreviation, our first derivative for $g$ can then be written
[from (1)]:

$$g'(t) = (H \cdot \nabla)f(P + tH),$$

and the second derivative can be written

$$g''(t) = (H \cdot \nabla)^2 f(P + tH).$$

Here again, we emphasize that $(H \cdot \nabla)$ and $(H \cdot \nabla)^2$ are first applied to $f$,
so that strictly speaking we should write an extra set of parentheses, e.g.

$$g'(t) = ((H \cdot \nabla)f)(a + th, b + tk)$$

and similarly for $g''(t)$.

The higher derivatives of g are determined similarly by induction. 

Theorem 2. Let $r$ be a positive integer. Let $f$ be a function defined on
an open set $U$, and having continuous partial derivatives of orders $\le r$.
Let $P$ be a point of $U$, and $H$ a vector. Let $g(t) = f(P + tH)$. Then

$$g^{(r)}(t) = ((H \cdot \nabla)^r f)(P + tH)$$

for all values of $t$ such that $P + tH$ lies in $U$.

Proof. The case $r = 1$ (even $r = 2$) has already been verified. Suppose
our formula proved for some integer $r$. Let $\psi = (H \cdot \nabla)^r f$. Then

$$g^{(r)}(t) = \psi(P + tH).$$

Hence by the case for $r = 1$ we get

$$g^{(r+1)}(t) = ((H \cdot \nabla)\psi)(P + tH).$$

Substituting the value for $\psi$ yields

$$g^{(r+1)}(t) = ((H \cdot \nabla)^{r+1} f)(P + tH),$$

thus proving our theorem by induction.

In terms of the $\partial/\partial x$ and $\partial/\partial y$ notation, we see that

$$g^{(r)}(t) = \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^r f(P + tH).$$

Taylor's Formula. Let $f$ be a function defined on an open set $U$, and
having continuous partial derivatives up to order $r$. Let $P$ be a point of
$U$, and $H$ a vector. Assume that the line segment

$$P + tH, \qquad 0 \le t \le 1,$$

is contained in $U$. Then there exists a number $\tau$ between 0 and 1 such that

$$f(P + H) = f(P) + \frac{(H \cdot \nabla)f(P)}{1!} + \cdots + \frac{(H \cdot \nabla)^{r-1} f(P)}{(r - 1)!} + \frac{(H \cdot \nabla)^r f(P + \tau H)}{r!}.$$

Proof. This is obtained by plugging the expression for the derivatives
of the function $g(t) = f(P + tH)$ into the Taylor formula for one
variable. We see that

$$g^{(s)}(0) = (H \cdot \nabla)^s f(P)$$

and

$$g^{(r)}(\tau) = (H \cdot \nabla)^r f(P + \tau H).$$

This proves Taylor's formula as stated.

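Rewritten in terms of the $\partial/\partial x$ and $\partial/\partial y$ notation, we have

$$f(a + h, b + k) = f(a, b) + \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)f(a, b) + \cdots$$

$$+ \frac{1}{(r - 1)!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^{r-1} f(a, b) + \frac{1}{r!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^r f(a + \tau h, b + \tau k).$$

The powers of the differential operators

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^s$$

are found by the usual binomial expansion. For instance:

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^2 = h^2\left(\frac{\partial}{\partial x}\right)^2 + 2hk\frac{\partial}{\partial x}\frac{\partial}{\partial y} + k^2\left(\frac{\partial}{\partial y}\right)^2,$$

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^3 = h^3\left(\frac{\partial}{\partial x}\right)^3 + 3h^2 k\left(\frac{\partial}{\partial x}\right)^2\left(\frac{\partial}{\partial y}\right) + 3hk^2\left(\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial y}\right)^2 + k^3\left(\frac{\partial}{\partial y}\right)^3.$$
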

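Example 1. Find the terms of degree $\le 2$ in the Taylor formula for
the function $f(x, y) = \log(1 + x + 2y)$ at the point $(2, 1)$.

We compute the partial derivatives. They are:

$$f(2, 1) = \log 5,$$

$$D_1 f(x, y) = \frac{1}{1 + x + 2y}, \qquad D_1 f(2, 1) = \frac{1}{5} = \frac{\partial f}{\partial x}(2, 1),$$

$$D_2 f(x, y) = \frac{2}{1 + x + 2y}, \qquad D_2 f(2, 1) = \frac{2}{5} = \frac{\partial f}{\partial y}(2, 1),$$

$$D_1^2 f(x, y) = -\frac{1}{(1 + x + 2y)^2}, \qquad D_1^2 f(2, 1) = -\frac{1}{25} = \frac{\partial^2 f}{\partial x^2}(2, 1),$$

$$D_2^2 f(x, y) = -\frac{4}{(1 + x + 2y)^2}, \qquad D_2^2 f(2, 1) = -\frac{4}{25} = \frac{\partial^2 f}{\partial y^2}(2, 1),$$

$$D_1 D_2 f(x, y) = -\frac{2}{(1 + x + 2y)^2}, \qquad D_1 D_2 f(2, 1) = -\frac{2}{25} = \frac{\partial^2 f}{\partial x\,\partial y}(2, 1).$$

Hence

$$f(2 + h, 1 + k) = \log 5 + \left(\frac{1}{5}h + \frac{2}{5}k\right) + \frac{1}{2!}\left[-\frac{1}{25}h^2 - \frac{4}{25}hk - \frac{4}{25}k^2\right] + \cdots.$$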
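
One way to gain confidence in the expansion of Example 1 is to compare it with the function for small $h$, $k$. The following sketch uses sample increments of our own choosing; the error should shrink roughly like the cube of the size of $(h, k)$.

```python
import math

def f(x, y):
    return math.log(1 + x + 2 * y)

def taylor_degree_2(h, k):
    """Terms of degree <= 2 of f at (2, 1), as found in Example 1."""
    linear = h / 5 + 2 * k / 5
    quadratic = (-h**2 / 25 - 4 * h * k / 25 - 4 * k**2 / 25) / 2
    return math.log(5) + linear + quadratic

for h, k in [(0.1, 0.05), (0.01, 0.005)]:
    error = f(2 + h, 1 + k) - taylor_degree_2(h, k)
    print(h, k, error)
```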



In many cases, we take $P = O$ and we wish to approximate $f(x, y)$ by
a polynomial in $x$, $y$. Thus we let $H = (x, y)$. In that case, the notation
$\partial/\partial x$ and $\partial/\partial y$ becomes even worse than usual since it is not entirely clear
in taking the square

$$\left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}\right)^2$$

what is to be treated as a constant and what is not. Thus it is better to
write

$$(xD_1 + yD_2)^2,$$

and similarly for higher powers. We then obtain a polynomial expression
for $f$, with a remainder term. The terms of degree $\le 3$ are as follows:

$$f(x, y) = f(0, 0) + D_1 f(0, 0)x + D_2 f(0, 0)y$$

$$+ \frac{1}{2!}\left[D_1^2 f(0, 0)x^2 + 2D_1 D_2 f(0, 0)xy + D_2^2 f(0, 0)y^2\right]$$

$$+ \frac{1}{3!}\left[D_1^3 f(0, 0)x^3 + 3D_1^2 D_2 f(0, 0)x^2 y + 3D_1 D_2^2 f(0, 0)xy^2 + D_2^3 f(0, 0)y^3\right]$$

$$+ R_4.$$


In general, the Taylor formula gives us an expression

$$f(x, y) = f(0, 0) + G_1(x, y) + \cdots + G_{r-1}(x, y) + R_r,$$

where $G_d(x, y)$ is a homogeneous polynomial in $x$, $y$ of degree $d$, and $R_r$ is
the remainder term. We call

$$f(0, 0) + G_1(x, y) + \cdots + G_s(x, y)$$

the polynomial approximation of $f$, of degree $\le s$.

We write polynomials in one variable as sums

$$\sum_{i=0}^{n} c_i x^i = c_0 + c_1 x + \cdots + c_n x^n.$$

In a similar way, we can write polynomials in several variables,

$$G(x, y) = \sum_{i=0}^{n} \sum_{j=0}^{m} c_{ij} x^i y^j.$$

Let $r$, $s$ be a pair of integers $\ge 0$. Then

$$D_1^r D_2^s G(0, 0) = r!\,s!\,c_{rs},$$

by the example at the end of §2. Hence we have a simple expression for
the coefficients of the polynomial.

$$c_{ij} = \frac{D_1^i D_2^j G(0, 0)}{i!\,j!}.$$

On the other hand, from the binomial expansion

$$(xD_1 + yD_2)^m = \sum_{i=0}^{m} \binom{m}{i} D_1^i D_2^{m-i} x^i y^{m-i}$$

and the value of the binomial coefficient,

$$\binom{m}{i} = \frac{m!}{i!\,(m - i)!},$$

we find that

$$\frac{(xD_1 + yD_2)^m}{m!} = \sum_{i=0}^{m} \frac{D_1^i D_2^{m-i}}{i!\,(m - i)!} x^i y^{m-i}.$$

Consequently,

$$\frac{(xD_1 + yD_2)^m f(0, 0)}{m!} = \sum_{i=0}^{m} c_{ij} x^i y^j$$

is a polynomial in $x$, $y$, such that $i + j = m$, so all its monomials have
the same degree, and the coefficients are given by

(*)   $$c_{ij} = \frac{D_1^i D_2^j f(0, 0)}{i!\,j!}.$$

The general Taylor polynomial of degree $\le s$ is therefore of the form

$$G(x, y) = \sum_{i+j \le s} c_{ij} x^i y^j,$$

where the coefficients $c_{ij}$ are given by the above formula (*). Again,
the example at the end of §2 shows that the partial derivatives up to total
order $s$ of this polynomial coincide with the derivatives of $f$, when
evaluated at $(0, 0)$. Thus we may say:

The Taylor polynomial of a function f up to order s is that polynomial 
having the same partial derivatives as the function up to order s, when 
evaluated at (0, 0). 

Example 2. Find the polynomial approximation of the function

$$f(x, y) = \log(1 + x + 2y)$$

up to degree 2.

We computed the partial derivatives in Example 1. For the present
application, we have

$$f(0, 0) = 0,$$

$$D_1 f(0, 0) = 1, \qquad D_2 f(0, 0) = 2,$$

$$D_1^2 f(0, 0) = -1, \qquad D_2^2 f(0, 0) = -4,$$

$$D_1 D_2 f(0, 0) = -2.$$

Hence the polynomial approximation of $f$ up to degree 2 is

$$G(x, y) = x + 2y - \tfrac{1}{2}(x^2 + 4xy + 4y^2).$$
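
The passage from the partial derivatives at the origin to the polynomial $G$ is just formula (*) applied term by term. As an illustration only, the sketch below stores the values from Example 2 in a dictionary (the names `partials_at_origin` and `coefficients` are ours) and uses exact fractions for the coefficients.

```python
from fractions import Fraction
from math import factorial, log

# Values D_1^i D_2^j f(0, 0) for f(x, y) = log(1 + x + 2y), taken from Example 2.
partials_at_origin = {
    (0, 0): 0,
    (1, 0): 1, (0, 1): 2,
    (2, 0): -1, (1, 1): -2, (0, 2): -4,
}

# Formula (*): c_ij = D_1^i D_2^j f(0, 0) / (i! j!).
coefficients = {
    (i, j): Fraction(value, factorial(i) * factorial(j))
    for (i, j), value in partials_at_origin.items()
}

def G(x, y):
    """The polynomial approximation of degree <= 2 built from the coefficients."""
    return sum(float(c) * x**i * y**j for (i, j), c in coefficients.items())

print(coefficients)
print(G(0.01, 0.02), log(1 + 0.01 + 2 * 0.02))
```

The coefficients of $x^2$, $xy$, $y^2$ come out as $-\tfrac{1}{2}$, $-2$, $-2$, in agreement with $-\tfrac{1}{2}(x^2 + 4xy + 4y^2)$.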

Example 3. In some special cases, there is a way of getting the
polynomial approximation to the function more simply by using more
directly the Taylor formula for one variable. Consider for instance the
function $f(x, y) = \sin(xy)$. For any number $u$ we know that

$$\sin u = u + R_3(u)$$

where $R_3(u)$ is the remainder of the Taylor formula for the sine function
of one variable. From the First Course in Calculus, you should know
that this remainder term satisfies the estimate

$$|R_3(u)| \le \frac{|u|^3}{3!}.$$

Thus if we let $u = xy$, then we find that

$$\sin(xy) = xy + R_3(xy),$$

and hence

$$|\sin(xy) - xy| \le \frac{|xy|^3}{3!}.$$

Also we see that, for example, for $y \ne 0$, we have

$$\frac{\sin(xy) - xy}{y} = \frac{R_3(xy)}{y}.$$

In particular, we get the estimate

$$\left|\frac{\sin(xy) - xy}{y}\right| \le \frac{|x|^3 |y|^2}{3!}.$$

From this we see that the limit of

$$\frac{\sin(xy) - xy}{y}$$

as $(x, y)$ approaches $(0, 0)$ is equal to 0.
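
The estimate of Example 3 can be observed numerically along a sample path to the origin. The sketch below uses points $(x, y) = (s, s)$ of our own choosing and keeps $s$ moderately sized, because for very small $s$ floating-point cancellation in $\sin(xy) - xy$ would dominate the comparison.

```python
import math

def quotient(x, y):
    return (math.sin(x * y) - x * y) / y

def bound(x, y):
    """The right-hand side |x|^3 |y|^2 / 3! of the estimate."""
    return abs(x)**3 * abs(y)**2 / math.factorial(3)

for scale in (1.0, 0.5, 0.25, 0.1):
    x, y = scale, scale
    print(scale, abs(quotient(x, y)), bound(x, y))
```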

Example 4. We can find the polynomial approximation of the
function in Example 2 by this method. We know from the theory of the
logarithm in one variable that

$$\log(1 + u) = u - \tfrac{1}{2}u^2 + \text{terms of higher degree}.$$

Hence putting $u = x + 2y$, we find that

$$\log(1 + x + 2y) = x + 2y - \tfrac{1}{2}(x + 2y)^2 + \text{terms of higher degree}$$

and therefore the polynomial approximation of our function up to degree
2 is given by

$$G(x, y) = x + 2y - \tfrac{1}{2}(x^2 + 4xy + 4y^2),$$

which is an easier way, rather than finding all the partials, because we
could just use the theory of functions of one variable.
Finally, we observe that the treatment of functions of several variables
follows exactly the same pattern. In this case, we let

$$H = (h_1, \ldots, h_n)$$

and

$$H \cdot \nabla = h_1 D_1 + \cdots + h_n D_n = h_1\frac{\partial}{\partial x_1} + \cdots + h_n\frac{\partial}{\partial x_n}.$$

Not a single word need be changed in Theorems 2 and 3 to get Taylor's
formula for several variables.
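
For $n$ variables the case $r = 1$ of Theorem 2 says that $g'(0) = (H \cdot \nabla)f(P)$. A minimal sketch in three variables, assuming a sample function and hand-computed gradient that are not taken from the text:

```python
import math

def f(point):
    x1, x2, x3 = point
    return x1 * x2 + math.sin(x3)       # sample function of three variables

def gradient(point):
    x1, x2, x3 = point
    return [x2, x1, math.cos(x3)]       # D_1 f, D_2 f, D_3 f, computed by hand

def h_dot_nabla(point, H):
    """(H . nabla) f at the point: h_1 D_1 f + ... + h_n D_n f."""
    return sum(h_i * d_i for h_i, d_i in zip(H, gradient(point)))

def g_prime_at_zero(point, H, step=1e-6):
    """Central-difference derivative at t = 0 of g(t) = f(P + tH)."""
    forward = f([p + step * h for p, h in zip(point, H)])
    backward = f([p - step * h for p, h in zip(point, H)])
    return (forward - backward) / (2 * step)

P = [1.0, -2.0, 0.5]
H = [0.3, 0.1, -0.7]
print(h_dot_nabla(P, H), g_prime_at_zero(P, H))
```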


Exercises 


Find the terms up to order 2 in the Taylor formula of the following functions
(taking $P = O$).

1. $\sin(xy)$    2. $\cos(xy)$    3. $\log(1 + xy)$

4. $\sin(x^2 + y^2)$    5. $e^{x+y}$    6. $\cos(x^2 + y)$

7. $(\sin x)(\cos y)$    8. $e^x \sin y$    9. $x + xy + 2y^2$

sin(xy) 

10. Does approach a limit as (x,y) approaches (0,0)? If so, what 

limit? 



11. Same questions for 




*v 


- 1 


and 


log(l + * 2 + y 2 ) 

x 2 + y 2 


12. Same questions for

$$\frac{\cos(xy) - 1}{x}.$$

13. Same questions for

$$\frac{\sin(xy) - xy}{x^2 y}.$$

14. Find the terms up to order 3 in Taylor's formula for the function $e^x \cos y$.

15. What is the term of degree 7 in Taylor’s formula for the function 

x 3 - 2 xy 4 + (jc - 1)V°? 

16. Show that if f(x, y, z) is a polynomial in x, y, z, then it is equal to its own 
Taylor series, i.e. there exists an integer n such that R n = 0. 

17. Find the polynomial approximation of the function 

f(x, y) = log(l + x + 2y) 


up to degree 3. 
18. In each one of Exercises 1 through 9, find the terms of degree $\le 2$ in the
Taylor expansion of the function at the indicated point.

1. $P = (1, \pi)$    2. $P = (1, \pi)$    3. $P = (2, 3)$

4. $P = (\sqrt{\pi}, \sqrt{\pi})$    5. $P = (1, 2)$    6. $P = (0, \pi)$

7. $P = (\pi/2, \pi)$    8. $P = (2, \pi/4)$    9. $P = (1, 1)$

19. Let / be a function of two variables with continuous partial derivatives of 
order ^ 2. Assume that f(jO ) = 0 and also that /(ra, rZ>) = r 2 /(a, 6) for all 
numbers r and all vectors ( a , 6). Show that for all points P = (a, b) we have 

/(/>> = • 

20. Let $U$ be an open set having the following property. Given two points
$X$, $Y$ in $U$, the line segment joining $X$ and $Y$ is contained in the open set.

(a) What is the parametric equation for this line segment?

(b) Let $f$ have continuous partial derivatives in $U$. Assume that

$$\|\operatorname{grad} f(P)\| \le M$$

for some number $M$, and all points $P$ in $U$. Show that for any two points
$X$, $Y$ in $U$ we have

$$|f(X) - f(Y)| \le M\|X - Y\|.$$

§4. Integral expressions 

Quite often, instead of the mean value type of remainder obtained
previously in Taylor's formula, it is useful to deal with an integral form
of the remainder. For instance, we have

(1)   $$f(x, y) = f(0, 0) + \int_0^1 \frac{d}{dt}\bigl(f(tx, ty)\bigr)\,dt.$$

This is a direct application of the definition of the integral, since we can
put $\psi(t) = f(tx, ty)$, and since

$$\int_0^1 \psi'(t)\,dt = \psi(1) - \psi(0).$$

If we now evaluate the derivative with respect to $t$ under the integral,
using the chain rule, we obtain

$$f(x, y) = f(0, 0) + \int_0^1 [D_1 f(tx, ty)x + D_2 f(tx, ty)y]\,dt$$

or

$$f(x, y) = f(0, 0) + xg_1(x, y) + yg_2(x, y),$$

where

$$g_1(x, y) = \int_0^1 D_1 f(tx, ty)\,dt \quad\text{and}\quad g_2(x, y) = \int_0^1 D_2 f(tx, ty)\,dt.$$

The advantage of such an expression is that the dependence of $g_1$ and $g_2$
on $(x, y)$ is quite smooth, just as smooth as that of $D_1 f$ and $D_2 f$. From
Chapter V, §2 we know that we can differentiate under the integral sign
with respect to $x$ and $y$. Thus this type of expression is often better than
the remainder of Taylor's formula, with an undetermined $\tau$ which usually
cannot be given explicitly, and depends on the specific choice of $(x, y)$.
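
The integral expression can be checked numerically by approximating $g_1$ and $g_2$ with a quadrature rule. The sketch below assumes a sample function with hand-computed partials and uses the composite Simpson rule; the function, the point, and the number of subintervals are our own choices.

```python
import math

def f(x, y):
    return math.exp(x) * math.sin(y) + x * x      # sample function

def D1f(x, y):
    return math.exp(x) * math.sin(y) + 2 * x      # D_1 f, computed by hand

def D2f(x, y):
    return math.exp(x) * math.cos(y)              # D_2 f, computed by hand

def simpson(func, n=200):
    """Composite Simpson rule for the integral of func over [0, 1]; n must be even."""
    width = 1.0 / n
    total = func(0.0) + func(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(i * width)
    return total * width / 3

def g1(x, y):
    return simpson(lambda t: D1f(t * x, t * y))

def g2(x, y):
    return simpson(lambda t: D2f(t * x, t * y))

x0, y0 = 0.8, -0.6
print(f(x0, y0), f(0, 0) + x0 * g1(x0, y0) + y0 * g2(x0, y0))
```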


Exercise 

1. Let $f$ be a function of two variables, with continuous partials of order $\le 2$.
Assume that $f(0, 0) = 0$ and $D_i f(0, 0) = 0$ for $i = 1, 2$. Show that there
exist continuous functions $h_1$, $h_2$, $h_3$ such that

$$f(x, y) = h_1(x, y)x^2 + h_2(x, y)xy + h_3(x, y)y^2.$$

[Hint: Apply the arguments of the text to the functions

$$D_1 f(tx, ty) \quad\text{and}\quad D_2 f(tx, ty)$$

using an integral with respect to some new variable, say $s$.]




CHAPTER VII 


Maximum and Minimum 


When we studied functions of one variable, we found maxima and 
minima by first finding critical points, i.e. points where the derivative is 
equal to 0, and then determining by inspection which of these are maxima 
or minima. We can carry out a similar investigation for functions of 
several variables. The condition that the derivative is equal to 0 must 
be replaced by the vanishing of all partial derivatives. 

§1. Critical points 

Let f be a differentiable function defined on an open set U. Let P be
a point of U. If all partial derivatives of f are equal to 0 at P, then we
say that P is a critical point of the function. In other words, for P to be a
critical point, we must have

$$D_1 f(P) = 0, \quad \ldots, \quad D_n f(P) = 0.$$

Example. Find the critical points of the function $f(x, y) = e^{-(x^2+y^2)}$.

Taking the partials, we see that

$$\frac{\partial f}{\partial x} = -2x\,e^{-(x^2+y^2)} \quad\text{and}\quad \frac{\partial f}{\partial y} = -2y\,e^{-(x^2+y^2)}.$$

The only value of (x, y) for which both these quantities are equal to 0 is 
x = 0 and y = 0. Hence the only critical point is (0, 0). 
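
As an informal check of this example, one can evaluate the two partials on a grid of sample points and keep the points where both vanish. This is a sketch only: the grid, the tolerance and the helper names `grad_f` and `is_critical` are assumptions made for illustration.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = exp(-(x^2 + y^2)), using the partials computed above."""
    e = math.exp(-(x * x + y * y))
    return (-2.0 * x * e, -2.0 * y * e)

def is_critical(x, y, tol=1e-12):
    # A point is critical when both partial derivatives are (numerically) 0.
    gx, gy = grad_f(x, y)
    return abs(gx) <= tol and abs(gy) <= tol

# Sample points (i/4, j/4) in the square [-2, 2] x [-2, 2].
grid = [(i / 4.0, j / 4.0) for i in range(-8, 9) for j in range(-8, 9)]
critical = [p for p in grid if is_critical(*p)]
print(critical)  # [(0.0, 0.0)]
```

A grid search of this kind cannot prove that there are no other critical points; the argument in the text does that.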

A critical point of a function of one variable is a point where the derivative
is equal to 0. We have seen examples where such a point need not be
a local maximum or a local minimum, for instance as in the following
picture:

A fortiori, a similar thing may occur for functions of several variables.
However, once we have found critical points, it is usually not too difficult
to tell by inspection whether they are of this type or not.

Let f be any function (differentiable or not), defined on an open set U.
We shall say that a point P of U is a local maximum for the function if
there exists an open ball (of positive radius) B, centered at P, such that
for all points X of B, we have

$$f(X) \leq f(P).$$

As an exercise, define local minimum in an analogous manner.

In the case of functions of one variable, we took an open interval instead
of an open ball around the point P. Thus our notion of local maximum
in n-space is the natural generalization of the notion in 1-space.

Theorem 1. Let f be a function which is defined and differentiable on
an open set U. Let P be a local maximum for f in U. Then P is a critical
point of f.

Proof. The proof is exactly the same as for functions of one variable.
In fact, we shall prove that the directional derivative of f at P in any
direction is 0. Let H be a non-zero vector. For small values of t, P + tH
lies in the open set U, and f(P + tH) is defined. Furthermore, for small
values of t, tH is small, and hence P + tH lies in our open ball such that

$$f(P + tH) \leq f(P).$$

Hence the function of one variable g(t) = f(P + tH) has a local maximum
at t = 0. Hence its derivative g'(0) is equal to 0. By the chain rule,
we obtain as usual:

$$\operatorname{grad} f(P) \cdot H = 0.$$

This equation is true for every non-zero vector H, and hence

$$\operatorname{grad} f(P) = O.$$

This proves what we wanted. 


Exercises 


Find the critical points of the following functions.

1. $x^2 + 4xy - y^2 - 8x - 6y$
2. $x + y \sin x$
3. $x^2 + y^2 + z^2$
4. $(x + y)e^{-xy}$
5. $xy + xz$
6. $\cos(x^2 + y^2 + z^2)$
7. $x^2 y^2$
8. $x^4 + y^2$
9. $(x - y)^4$
10. $x \sin y$
11. $x^2 + 2y^2 - x$
12. $e^{-(x^2+y^2+z^2)}$
13. $e^{(x^2+y^2+z^2)}$

14. In each of the preceding exercises, find the minimum value of the given 
function, and give all points where the value of the function is equal to 
this minimum. [Do this exercise after you have read § 3.] 


§2. The quadratic form 


Let f be a differentiable function on an open set U, and assume that all
partial derivatives up to order 3 exist and are continuous. Let P be a point
of U, and assume that P is a critical point of f. We assume that we work
in 2-space, so that we can express f near the point P = (a, b) in the form

$$f(a + h, b + k) = f(a, b) + \frac{1}{2}\left[h^2 \frac{\partial^2 f}{\partial x^2}(a, b) + 2hk \frac{\partial^2 f}{\partial x\,\partial y}(a, b) + k^2 \frac{\partial^2 f}{\partial y^2}(a, b)\right] + R_3$$

where $R_3$ is a remainder term. Actually, we prefer to write in the other
notation:

$$f(a + h, b + k) = f(a, b) + \frac{1}{2}\left[h^2 D_1^2 f(a, b) + 2hk\,D_1 D_2 f(a, b) + k^2 D_2^2 f(a, b)\right] + R_3$$

because we want to use x, y for other purposes.

The function q(x, y) of x, y given by

$$q(x, y) = \frac{1}{2}\left[x^2 D_1^2 f(P) + 2xy\,D_1 D_2 f(P) + y^2 D_2^2 f(P)\right]$$

is called the quadratic form associated with f at P, whenever P is a critical
point of f. This quadratic form approximates the values of f so that
one gets some general idea of the behavior of f near P when the terms
of degree 1 vanish.

Example 1. Let $f(x, y) = e^{-(x^2+y^2)}$. Then it is a simple matter to
verify that

$$\operatorname{grad} f(0, 0) = O.$$

We let P = (0, 0) be the origin. Standard computations show that

$$D_1^2 f(O) = -2, \quad D_1 D_2 f(O) = 0, \quad D_2^2 f(O) = -2.$$

Substituting these values in the general formula gives the expression for
the quadratic form, namely

$$q(x, y) = -(x^2 + y^2).$$

We see that the quadratic form is nothing but the term of degree 2 in
the Taylor expansion of the function at the given point.
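
The second partials in Example 1 can also be estimated by central differences, which gives a quick way to form the quadratic form of a function at a critical point. The sketch below assumes a step size `h = 1e-4` and uses the hypothetical helper names `second_partials` and `quadratic_form`; it is an approximation, not an exact computation.

```python
import math

def f(x, y):
    # The function of Example 1.
    return math.exp(-(x * x + y * y))

def second_partials(func, a, b, h=1e-4):
    """Central-difference estimates of D_1^2 f, D_1 D_2 f and D_2^2 f at (a, b)."""
    f0 = func(a, b)
    d11 = (func(a + h, b) - 2.0 * f0 + func(a - h, b)) / (h * h)
    d22 = (func(a, b + h) - 2.0 * f0 + func(a, b - h)) / (h * h)
    d12 = (func(a + h, b + h) - func(a + h, b - h)
           - func(a - h, b + h) + func(a - h, b - h)) / (4.0 * h * h)
    return d11, d12, d22

def quadratic_form(func, a, b):
    """Return q(x, y) = (1/2)[x^2 D_1^2 f + 2xy D_1 D_2 f + y^2 D_2^2 f] at (a, b)."""
    d11, d12, d22 = second_partials(func, a, b)

    def q(x, y):
        return 0.5 * (x * x * d11 + 2.0 * x * y * d12 + y * y * d22)

    return q

q = quadratic_form(f, 0.0, 0.0)
# Should be close to -(1^2 + 2^2) = -5, up to discretization error.
print(q(1.0, 2.0))
```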

We shall now describe the level curves for some quadratic forms to get 
an idea of their behavior near the origin. 


Example 2. $q(x, y) = x^2 + y^2$. Then a graph of the function q and
the level curves look like those in Figs. 2 and 3.

[Figures 2 and 3: graph of q and its level curves]

In this example, we see that the origin (0, 0) is a local minimum point
for the form.

Example 3. $q(x, y) = -(x^2 + y^2)$. The graph and level curves look
like these:

The origin is a local maximum for the form.

Example 4. $q(x, y) = x^2 - y^2$. The level curves are then hyperbolas,
determined for each number c by the equation $x^2 - y^2 = c$:

Of course, when c = 0, we get the two straight lines as shown.

Example 5. q(x, y) = xy. The level curves look like the following
(similar to the preceding example, but turned around):

In Examples 4 and 5, we see that the origin, which is a critical point,
is neither a local maximum nor local minimum.

Exercises 

1. Let $f(x, y) = 3x^2 - 4xy + y^2$. Show that the origin is a critical point of f.

2. More generally, let a, b, c be numbers. Show that the function f given by
$f(x, y) = ax^2 + bxy + cy^2$ has the origin as a critical point.

3. Find the quadratic form associated to the function f at the critical points
P in the Exercises of §1.


4. Sketch the level curves for the following quadratic forms. Determine
whether the origin is a local maximum, minimum, or neither.

(a) $q(x, y) = 7x^2 - y^2$
(b) $q(x, y) = 3x^2 + 4y^2$
(c) $q(x, y) = -(4x^2 + 5y^2)$
(d) $q(x, y) = y^2 - x^2$
(e) $q(x, y) = 2y^2 - x^2$
(f) $q(x, y) = y^2 - 4x^2$
(g) $q(x, y) = -(3x^2 + 2y^2)$
(h) $q(x, y) = 2xy$


§3. Boundary points 


In considering intervals, we had to distinguish between closed and open 
intervals. We must make an analogous distinction when considering sets 
of points in space. 

Let S be a set of points, in some n-space. Let P be a point of S. We
shall say that P is an interior point of S if there exists an open ball B of
positive radius, centered at P, and such that B is contained in S. The
next picture illustrates an interior point (for the set consisting of the
region enclosed by the curve).



We have also drawn an open ball around P. 

From the very definition, we conclude that the set consisting of all 
interior points of S is an open set. 

A point P (not necessarily in S ) is called a boundary point of S if every 
open ball B centered at P includes a point of S, and also a point which 
is not in S. We illustrate a boundary point in the following picture: 


Figure 9 

For example, the set of boundary points of the closed ball of radius
a > 0 is the sphere of radius a. In 2-space, the plane, the region consisting
of all points with y > 0 is open. Its boundary points are the
points lying on the x-axis.

If a set contains all of its boundary points, then we shall say that the
set is closed.

Finally, a set is said to be bounded if there exists a number b > 0 such
that, for every point X of the set, we have

$$\|X\| \leq b.$$

We are now in a position to state the existence of maxima and minima 
for continuous functions. 

Theorem 2. Let S be a closed and bounded set. Let f be a continuous
function defined on S. Then f has a maximum and a minimum in S. In
other words, there exists a point P in S such that

$$f(P) \geq f(X)$$

for all X in S, and there exists a point Q in S such that

$$f(Q) \leq f(X)$$

for all X in S.


We shall not prove this theorem. It depends on an analysis which is 
beyond the level of this course. 

When trying to find a maximum (say) for a function f, one should first
determine the critical points of f in the interior of the region under
consideration. If a maximum lies in the interior, it must be among these
critical points.

Next, one should investigate the function on the boundary of the region. 
By parametrizing the boundary, one frequently reduces the problem of 
finding a maximum on the boundary to a lower-dimensional problem, 
to which the technique of critical points can also be applied. 

Finally, one has to compare the possible maximum of f on the boundary
and in the interior to determine which points are maximum points.

Example. In the Example in §1, we observe that the function

$$f(x, y) = e^{-(x^2+y^2)}$$

becomes very small as x or y becomes large. Consider some big closed
disc centered at the origin. We know by Theorem 2 that the function
has a maximum in this disc. Since the value of the function is small on
the boundary, it follows that this maximum must be an interior point,
and hence that the maximum is a critical point. But we found in the
Example in §1 that the only critical point is at the origin. Hence we conclude
that the origin is the only maximum of the function f(x, y). The
value of f at the origin is f(0, 0) = 1. Furthermore, the function has no
minimum, because f(x, y) is always positive and approaches 0 as x and y
become large.

Exercises

Find the maximum and minimum points of the following functions in the
indicated region.

1. x + y in the square with corners at (±1, ±1)

2. (a) x + y + z in the region $x^2 + y^2 + z^2 < 1$
(b) x + y in the region $x^2 + y^2 < 1$

3. $xy - (1 - x^2 - y^2)^{1/2}$ in the region $x^2 + y^2 \leq 1$

4. $144x^3 y^2 (1 - x - y)$ in the region $x \geq 0$ and $y \geq 0$ (the first quadrant
together with its boundary)

5. $(x^2 + 2y^2)e^{-(x^2+y^2)}$ in the plane

6. (a) $(x^2 + y^2)^{-1}$ in the region $(x - 2)^2 + y^2 \leq 1$

(b) $(x^2 + y^2)^{-1}$ in the region $x^2 + (y - 2)^2 \leq 1$

7. Which of the following functions have a maximum and which have a 
minimum in the whole plane? 

(a) (x + 2 y)e- x2 ~ vi (b) e*~* 

(o <d) 1 2+ ‘ 10 

(e) (3x 2 + 2/)e~ l4 ‘* + “ !> (t) -xV‘ +y “ 

(8) (i ll + bi if O’°> 

l 0 if (x, y) = (0,0) 

8. Which is the point on the curve (cos t, sin t, sin(t/2)) farthest from the
origin?


§4. Lagrange multipliers 

In this section, we shall investigate another method for finding the 
maximum or minimum of a function on some set of points. This method 
is particularly well adapted to the case when the set of points is described 
by means of an equation. 

We shall work in 3-space. Let g be a differentiable function of three 
variables x, y, z. We consider the surface 

g(X) = 0. 

Let U be an open set containing this surface, and let f be a differentiable
function defined for all points of U. We wish to find those points P on
the surface g(X) = 0 such that f(P) is a maximum or a minimum on the
surface. In other words, we wish to find all points P such that g(P) = 0,
and either

$$f(P) \geq f(X) \quad\text{for all } X \text{ such that } g(X) = 0,$$

or

$$f(P) \leq f(X) \quad\text{for all } X \text{ such that } g(X) = 0.$$

Any such point will be called an extremum for f subject to the constraint g.

In what follows, we consider only points P such that g(P) = 0 but
grad g(P) ≠ O.

Theorem 3. Let g be a continuously differentiable function on an open
set U. Let S be the set of points X in U such that g(X) = 0 but

$$\operatorname{grad} g(X) \neq O.$$

Let f be a continuously differentiable function on U and assume that P
is a point of S such that P is an extremum for f on S. (In other words,
P is an extremum for f, subject to the constraint g.) Then there exists a
number λ such that

$$\operatorname{grad} f(P) = \lambda \operatorname{grad} g(P).$$

Proof. Let X: J → S be a differentiable curve on the surface S passing
through P, say $X(t_0) = P$. Then the function $t \mapsto f(X(t))$ has a maximum
or a minimum at $t_0$. Its derivative

$$\frac{d}{dt} f(X(t))$$

is therefore equal to 0 at $t_0$. But this derivative is equal to

$$\operatorname{grad} f(P) \cdot X'(t_0) = 0.$$

[Figure 10: grad g(P) and grad f(P) = λ grad g(P) at the point P, with the plane at P]

Hence grad f(P) is perpendicular to every curve on the surface passing
through P (Fig. 10). Under these circumstances, and the hypothesis
that grad g(P) ≠ O, there exists a number λ such that

$$\operatorname{grad} f(P) = \lambda \operatorname{grad} g(P), \tag{1}$$

or in other words, grad f(P) has the same, or opposite direction, as
grad g(P), provided it is not O. This is rather clear, since the direction of
grad g(P) is the direction perpendicular to the surface, and we have seen
that grad f(P) is also perpendicular to the surface.

Conversely, when we want to find an extremum point for f subject to
the constraint g, we find all points P such that g(P) = 0, and such that
relation (1) is satisfied. We can then find our extremum points among
these by inspection.

(Note that this procedure is analogous to the procedure used to find 
maxima or minima for functions of one variable. We first determined all 
points at which the derivative is equal to 0, and then determined maxima 
or minima by inspection.) 

Example 1. Find the maximum of the function f(x, y) = x + y subject
to the constraint $x^2 + y^2 = 1$.

We let $g(x, y) = x^2 + y^2 - 1$, so that S consists of all points (x, y)
such that g(x, y) = 0. We have

$$\operatorname{grad} f(x, y) = (1, 1),$$
$$\operatorname{grad} g(x, y) = (2x, 2y).$$

Let $(x_0, y_0)$ be a point for which there exists a number λ satisfying

$$\operatorname{grad} f(x_0, y_0) = \lambda \operatorname{grad} g(x_0, y_0),$$

or in other words

$$1 = 2x_0\lambda \quad\text{and}\quad 1 = 2y_0\lambda.$$

Then $x_0 \neq 0$ and $y_0 \neq 0$. Hence $\lambda = 1/(2x_0) = 1/(2y_0)$, and consequently
$x_0 = y_0$. Since the point $(x_0, y_0)$ must satisfy the equation $g(x_0, y_0) = 0$,
we get the possibilities:

$$x_0 = \pm\frac{1}{\sqrt{2}} \quad\text{and}\quad y_0 = \pm\frac{1}{\sqrt{2}}.$$

It is then clear that $(1/\sqrt{2}, 1/\sqrt{2})$ is a maximum for f, since the only
other possibility $(-1/\sqrt{2}, -1/\sqrt{2})$ is a point at which f takes on a negative
value, and $f(1/\sqrt{2}, 1/\sqrt{2}) = 2/\sqrt{2} > 0$.
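
The conclusion of Example 1 can be compared with a direct search along the circle. The sketch below assumes the parametrization (cos θ, sin θ) and a sample of 3600 equally spaced angles; the names `on_circle`, `candidates` and `sampled_max` are illustrative only.

```python
import math

def f(x, y):
    return x + y

def g(x, y):
    # Constraint function: the circle is the set where g = 0.
    return x * x + y * y - 1.0

# The two candidate points produced by the Lagrange condition.
s = 1.0 / math.sqrt(2.0)
candidates = [(s, s), (-s, -s)]

def on_circle(k, n):
    """k-th of n equally spaced points on the unit circle."""
    theta = 2.0 * math.pi * k / n
    return math.cos(theta), math.sin(theta)

# Independent check: largest sampled value of f on the circle.
n = 3600
sampled_max = max(f(*on_circle(k, n)) for k in range(n))

best = max(candidates, key=lambda p: f(*p))
print(best, f(*best), sampled_max)
```

The sampled maximum and the value at the candidate $(1/\sqrt{2}, 1/\sqrt{2})$ should both be close to $\sqrt{2}$.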

Example 2. Find the extrema for the function $x^2 + y^2 + z^2$ subject
to the constraint $x^2 + 2y^2 - z^2 - 1 = 0$.

Computing the partial derivatives of the functions f and g, we find that
we must solve the system of equations

(a) $2x = \lambda \cdot 2x$, (b) $2y = \lambda \cdot 4y$,

(c) $2z = \lambda \cdot (-2z)$, (d) $g(X) = x^2 + 2y^2 - z^2 - 1 = 0$.

Let $(x_0, y_0, z_0)$ be a solution. If $z_0 \neq 0$, then from (c) we conclude
that λ = −1. The only way to solve (a) and (b) with λ = −1 is that
x = y = 0. In that case, from (d), we would get

$$z_0^2 = -1,$$

which is impossible. Hence any solution must have $z_0 = 0$.

If $x_0 \neq 0$, then from (a) we conclude that λ = 1. From (b) and (c)
we then conclude that $y_0 = z_0 = 0$. From (d), we must have $x_0 = \pm 1$.
In this manner, we have obtained two solutions satisfying our conditions,
namely

(1, 0, 0) and (−1, 0, 0).

Similarly, if $y_0 \neq 0$, we find two more solutions, namely

$(0, \sqrt{1/2}, 0)$ and $(0, -\sqrt{1/2}, 0)$.

These four points are therefore the possible extrema of the function f subject to
the constraint g.

If we ask for the minimum of f, then a direct computation shows that
the last two points

$(0, \pm\sqrt{1/2}, 0)$

are the only possible solutions (because 1 > 1/2).
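
The four points of Example 2 can be checked against the system (a) to (d) mechanically. In the sketch below, `multiplier` is a hypothetical helper that returns the number λ with grad f = λ grad g at a point, or `None` when no such number exists; the tolerances are assumptions.

```python
import math

def f(p):
    x, y, z = p
    return x * x + y * y + z * z

def g(p):
    x, y, z = p
    return x * x + 2.0 * y * y - z * z - 1.0

def grad_f(p):
    x, y, z = p
    return (2.0 * x, 2.0 * y, 2.0 * z)

def grad_g(p):
    x, y, z = p
    return (2.0 * x, 4.0 * y, -2.0 * z)

def multiplier(p, tol=1e-12):
    """Return lam with grad f(p) = lam * grad g(p), or None if there is none."""
    gf, gg = grad_f(p), grad_g(p)
    lam = None
    for a, b in zip(gf, gg):
        if abs(b) > tol:
            lam = a / b  # read lam off the first non-zero component of grad g
            break
    if lam is None:
        return None
    ok = all(abs(a - lam * b) <= 1e-9 for a, b in zip(gf, gg))
    return lam if ok else None

r = math.sqrt(0.5)
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, r, 0.0), (0.0, -r, 0.0)]
for p in points:
    # Each line shows the point, g(p) (close to 0), the multiplier, and f(p).
    print(p, g(p), multiplier(p), f(p))
```

The values of f at the first two points are 1 and at the last two are 1/2, in agreement with the comparison 1 > 1/2 made in the text.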

Exercises 

1. (a) Find the minimum of the function $x + y^2$ subject to the constraint

$$2x^2 + y^2 = 1.$$

(b) Find its maximum.

2. Find the maximum value of $x^2 + xy + y^2 + yz + z^2$ on the sphere of
radius 1.

3. Let A = (1, 1, −1), B = (2, 1, 3), C = (2, 0, −1). Find the point at
which the function

$$f(X) = (X - A)^2 + (X - B)^2 + (X - C)^2$$

reaches its minimum, and find the minimum value.

4. Do Exercise 3 in general, for any three distinct vectors

$$A = (a_1, a_2, a_3), \quad B = (b_1, b_2, b_3), \quad C = (c_1, c_2, c_3).$$

5. Find the maximum of the function $3x^2 + 2\sqrt{2}\,xy + 4y^2$ on the circle of
radius 3 in the plane.

6. Find the maximum of the function xyz subject to the constraints $x \geq 0$,
$y \geq 0$, $z \geq 0$, and xy + yz + xz = 2.

7. Find the maximum and minimum distance from points on the curve

$$5x^2 + 6xy + 5y^2 = 0$$

to the origin in the plane.

8. Find the extreme values of the function $\cos^2 x + \cos^2 y$ subject to the
constraint $x - y = \pi/4$ and $0 \leq x \leq \pi$.

9. Find the points on the surface $z^2 - xy = 1$ nearest to the origin.

10. Find the extreme values of the function xy subject to the condition
x + y = 1.

11. Find the shortest distance between the point (1, 0) and the curve $y^2 = 4x$.

12. Find the maximum and minimum points of the function

f(x, y, z) = x + y + z

in the region $x^2 + y^2 + z^2 \leq 1$.

13. Find the extremum values of the function f(x, y, z) = x − 2y + 2z on
the sphere $x^2 + y^2 + z^2 = 1$.

14. Find the maximum of the function f(x, y, z) = x + y + z on the sphere
$x^2 + y^2 + z^2 = 4$.

15. Find the extreme values of the function f given by f(x, y, z) = xyz subject
to the condition x + y + z = 1.

16. Find the extreme values of the function given by $f(x, y, z) = (x + y + z)^2$
subject to the condition $x^2 + 2y^2 + 3z^2 = 1$.

17. Find the minimum of the function $f(x, y, z) = x^2 + y^2 + z^2$ subject to
the condition 3x + 2y − 7z = 5.

18. In general, if a, b, c, d are numbers with not all of a, b, c equal to 0, find the
minimum of the function $x^2 + y^2 + z^2$ subject to the condition

ax + by + cz = d.

19. Find the maximum and minimum value of the function

$$f(x, y) = x^2 + 2y^2 - x$$

on the disc of radius 1 centered at the origin.

20. Find the shortest distance from a point on the ellipse $x^2 + 4y^2 = 4$ to the
line x + y = 4.

MATRICES, LINEAR MAPS
AND DETERMINANTS 



The present book is organized by topics. This means that we may go 
deeper in one part than is necessary to read another part. Certain portions 
may therefore be omitted without impairing the understanding of later 
portions. 

Specifically, the reader may omit this entire part, and go immediately 
to the chapter on multiple integrals, which can be read after Chapter I. 
In a one-term course, you can cover Green’s theorem also without any 
knowledge of matrices or determinants. 

The amount of “linear algebra” discussed here is kept at a minimum,
and is intended for applications to the Jacobian matrix in Chapter XI,
and to the surface or volume integrals occurring in Chapter XV. Thus
you may postpone this part until you read those sections. These applications
describe in various contexts how an arbitrary mapping can be
approximated by a linear mapping.



CHAPTER VIII 


Matrices 


§1. Matrices 


We consider a new kind of object, matrices. 

Let n, m be two integers ≥ 1. An array of numbers

$$\begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{pmatrix}$$

is called a matrix. We can abbreviate the notation for this matrix by
writing it $(a_{ij})$, i = 1, ..., m and j = 1, ..., n. We say that it is an m
by n matrix, or an m × n matrix. The matrix has m rows and n columns.
For instance, the first column is

$$\begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}$$

and the second row is $(a_{21}, a_{22}, \ldots, a_{2n})$. We call $a_{ij}$ the ij-entry or
ij-component of the matrix.

If you look back at Chapter I, §1, the example of 7-space taken from
economics gives rise to a 7 × 7 matrix $(a_{ij})$ (i, j = 1, ..., 7), where $a_{ij}$
is the amount spent by the i-th industry on the j-th industry. Thus keeping
the notation of that example, if $a_{25} = 50$, this means that the auto industry
bought 50 million dollars’ worth of stuff from the chemical industry
during the given year.

Example 1. The following is a 2 × 3 matrix:

$$\begin{pmatrix} 1 & 1 & -2 \\ -1 & 4 & -5 \end{pmatrix}.$$

It has two rows and three columns.

The rows are (1, 1, −2) and (−1, 4, −5). The columns are

$$\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 4 \end{pmatrix}, \quad \begin{pmatrix} -2 \\ -5 \end{pmatrix}.$$

Thus the rows of a matrix may be viewed as n-tuples, and the columns
may be viewed as vertical m-tuples. A vertical m-tuple is also called a
column vector.

A vector $(x_1, \ldots, x_n)$ is a 1 × n matrix. A column vector

$$\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

is an n × 1 matrix.

When we write a matrix in the form $(a_{ij})$, then i denotes the row and
j denotes the column. In Example 1, we have for instance $a_{11} = 1$,
$a_{23} = -5$.

A single number (a) may be viewed as a 1 × 1 matrix.

Let $(a_{ij})$, i = 1, ..., m and j = 1, ..., n be a matrix. If m = n,
then we say that it is a square matrix. Thus



are both square matrices. 

We have a zero matrix, in which $a_{ij} = 0$ for all i, j. It looks like this:

$$\begin{pmatrix}
0 & 0 & 0 & \cdots & 0 \\
0 & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}$$

We shall write it O. We note that we have met so far with the zero number,
zero vector, and zero matrix.

We shall now define addition of matrices and multiplication of matrices 
by numbers. 

We define addition of matrices only when they have the same size.
Thus let m, n be fixed integers ≥ 1. Let $A = (a_{ij})$ and $B = (b_{ij})$ be
two m × n matrices. We define A + B to be the matrix whose entry in
the i-th row and j-th column is $a_{ij} + b_{ij}$. In other words, we add matrices
of the same size componentwise.
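
Example 2. Let

$$A = \begin{pmatrix} 1 & -1 & 0 \\ 2 & 3 & 4 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 5 & 1 & -1 \\ 2 & 1 & -1 \end{pmatrix}.$$

Then

$$A + B = \begin{pmatrix} 6 & 0 & -1 \\ 4 & 4 & 3 \end{pmatrix}.$$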
If A, B are both 1 × n matrices, i.e. n-tuples, then we note that our
addition of matrices coincides with the addition which we defined in
Chapter I for n-tuples.

If O is the zero matrix, then for any matrix A (of the same size, of
course), we have O + A = A + O = A. This is trivially verified.

We shall now define the multiplication of a matrix by a number. Let
c be a number, and $A = (a_{ij})$ be a matrix. We define cA to be the matrix
whose ij-component is $ca_{ij}$. We write $cA = (ca_{ij})$. Thus we multiply
each component of A by c.


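Example 3. Let A, B be as in Example 2. Let c = 2. Then

$$2A = \begin{pmatrix} 2 & -2 & 0 \\ 4 & 6 & 8 \end{pmatrix} \quad\text{and}\quad 2B = \begin{pmatrix} 10 & 2 & -2 \\ 4 & 2 & -2 \end{pmatrix}.$$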

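For any matrix A we let −A be the matrix obtained by multiplying each
component of A with −1. If $A = (a_{ij})$, then

$$-A = (-1)A = (-a_{ij}).$$

For instance, if

$$A = \begin{pmatrix} 1 & -1 & 0 \\ 2 & 3 & 4 \end{pmatrix}$$

is the matrix of Example 2, then

$$-A = (-1)A = \begin{pmatrix} -1 & 1 & 0 \\ -2 & -3 & -4 \end{pmatrix}.$$

Observe that for any matrix A we have

$$A + (-A) = A - A = O.$$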
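
The componentwise definitions translate directly into code. The following sketch represents a matrix as a list of rows; the functions `add`, `scale` and `negate` and the sample matrices `M`, `N` are hypothetical names chosen for illustration.

```python
def add(A, B):
    """Componentwise sum of two matrices of the same size."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

def scale(c, A):
    """Multiply each component of A by the number c."""
    return [[c * a for a in row] for row in A]

def negate(A):
    """The matrix -A = (-1)A."""
    return scale(-1, A)

# Two sample 2 x 3 matrices (illustrative values only).
M = [[1, 2, 3], [4, 5, 6]]
N = [[10, 20, 30], [40, 50, 60]]

print(add(M, N))          # [[11, 22, 33], [44, 55, 66]]
print(scale(2, M))        # [[2, 4, 6], [8, 10, 12]]
print(add(M, negate(M)))  # [[0, 0, 0], [0, 0, 0]]
```

The size check mirrors the requirement that addition is defined only for matrices of the same size.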

We define one more notion related to a matrix. Let $A = (a_{ij})$ be an
m × n matrix. The n × m matrix $B = (b_{ji})$ such that $b_{ji} = a_{ij}$ is called
the transpose of A, and is also denoted by ${}^tA$. Taking the transpose of a
matrix amounts to changing rows into columns and vice versa. If A is
the matrix which we wrote down at the beginning of this section, then ${}^tA$
is the matrix

$$\begin{pmatrix}
a_{11} & a_{21} & a_{31} & \cdots & a_{m1} \\
a_{12} & a_{22} & a_{32} & \cdots & a_{m2} \\
\vdots & \vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & a_{3n} & \cdots & a_{mn}
\end{pmatrix}.$$


To take a special case: 

If A = 1 °^, then *A 

If A = (2, 1, −4) is a row vector, then

$${}^tA = \begin{pmatrix} 2 \\ 1 \\ -4 \end{pmatrix}$$

is a column vector.
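
Changing rows into columns is a one-line operation when a matrix is stored as a list of rows. The sketch below uses the hypothetical name `transpose` and a sample matrix `M` that is not taken from the text; the row vector (2, 1, −4) is the one used above.

```python
def transpose(A):
    """Return the transpose of A: rows become columns and vice versa."""
    return [list(col) for col in zip(*A)]

# A sample 2 x 3 matrix (illustrative values only).
M = [[1, 2, 3], [4, 5, 6]]
print(transpose(M))              # [[1, 4], [2, 5], [3, 6]]

# The row vector (2, 1, -4) as a 1 x 3 matrix becomes a 3 x 1 column vector.
print(transpose([[2, 1, -4]]))   # [[2], [1], [-4]]

# Transposing twice gives back the original matrix.
print(transpose(transpose(M)) == M)  # True
```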



« 


1. Let 


Exercises 





5 

1 



Find A + B, 3 B, -2B, A + 2B, 2A + B, A - B, A - 2 B, B 



A. 


Find A + B, 3 B, -2B, A + 2B, A - B, B - A. 

3. In Exercise 1, find ${}^tA$ and ${}^tB$.

4. In Exercise 2, find ${}^tA$ and ${}^tB$.

5. If A, B are arbitrary m × n matrices, show that

$${}^t(A + B) = {}^tA + {}^tB.$$

6. If c is a number, show that ${}^t(cA) = c\,{}^tA$.

7. If $A = (a_{ij})$ is a square matrix, then the elements $a_{ii}$ are called the diagonal
elements. How do the diagonal elements of A and ${}^tA$ differ?

8. Find ${}^t(A + B)$ and ${}^tA + {}^tB$ in Exercise 2.

9. Find $A + {}^tA$ and $B + {}^tB$ in Exercise 2.

10. A matrix A is said to be symmetric if $A = {}^tA$. Show that for any square
matrix A, the matrix $A + {}^tA$ is symmetric.

11. Write down the row vectors and column vectors of the matrices A, B in
Exercise 1.

12. Write down the row vectors and column vectors of the matrices A, B in
Exercise 2.

§2. Multiplication of matrices 


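We shall now define the product of matrices. Let $A = (a_{ij})$,
i = 1, ..., m and j = 1, ..., n be an m × n matrix. Let $B = (b_{jk})$,
j = 1, ..., n and k = 1, ..., s be an n × s matrix.

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}, \quad B = \begin{pmatrix} b_{11} & \cdots & b_{1s} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{ns} \end{pmatrix}.$$

We define the product AB to be the m × s matrix whose ik-coordinate is

$$\sum_{j=1}^{n} a_{ij} b_{jk} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}.$$

If $A_1, \ldots, A_m$ are the row vectors of the matrix A, and if $B^1, \ldots, B^s$
are the column vectors of the matrix B, then the ik-coordinate of the
product AB is equal to $A_i \cdot B^k$. Thus

$$AB = \begin{pmatrix} A_1 \cdot B^1 & \cdots & A_1 \cdot B^s \\ \vdots & & \vdots \\ A_m \cdot B^1 & \cdots & A_m \cdot B^s \end{pmatrix}.$$

Multiplication of matrices is therefore a generalization of the dot product.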
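
The description of AB through dot products of rows of A with columns of B can be written out as a short program. This is a sketch under the assumption that matrices are stored as lists of rows; `dot`, `columns`, `matmul` and the sample matrices `P`, `Q` are illustrative names and values.

```python
def dot(u, v):
    """Dot product of two tuples of the same length."""
    return sum(a * b for a, b in zip(u, v))

def columns(B):
    """The column vectors of B, each returned as a list."""
    return [list(col) for col in zip(*B)]

def matmul(A, B):
    """Product AB: the ik-entry is (i-th row of A) . (k-th column of B)."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    cols = columns(B)
    return [[dot(row, col) for col in cols] for row in A]

# A sample 2 x 3 matrix and a sample 3 x 2 matrix (illustrative values only).
P = [[1, 2, 0],
     [0, 1, 3]]
Q = [[1, 0],
     [2, 1],
     [0, 4]]

print(matmul(P, Q))  # [[5, 2], [2, 13]]
```

Here the product of a 2 × 3 matrix with a 3 × 2 matrix is a 2 × 2 matrix, as the definition predicts.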


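Example 1. Let

$$A = \begin{pmatrix} 2 & 1 & 5 \\ 1 & 3 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & 4 \\ -1 & 2 \\ 2 & 1 \end{pmatrix}.$$

Then AB is a 2 × 2 matrix, and computations show that

$$AB = \begin{pmatrix} 15 & 15 \\ 4 & 12 \end{pmatrix}.$$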

Example 2. Let



Let A, B be as in Example 1. Then 



Compute (AB)C. What do you find? 

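Let A be an m × n matrix and let B be an n × 1 matrix, i.e. a column
vector. Then AB is again a column vector. The product looks like this:

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \begin{pmatrix} c_1 \\ \vdots \\ c_m \end{pmatrix},$$

where

$$c_i = \sum_{j=1}^{n} a_{ij} b_j = a_{i1} b_1 + \cdots + a_{in} b_n.$$

If $X = (x_1, \ldots, x_m)$ is a row vector, i.e. a 1 × m matrix, then we can
form the product XA, which looks like this:

$$(x_1, \ldots, x_m) \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} = (y_1, \ldots, y_n),$$

where

$$y_k = x_1 a_{1k} + \cdots + x_m a_{mk}.$$

In this case, XA is a 1 × n matrix, i.e. a row vector.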

If A is a square matrix, then we can form the product AA, which will
be a square matrix of the same size as A. It is denoted by $A^2$. Similarly,
we can form $A^3$, $A^4$, and in general, $A^n$ for any positive integer n.

We define the unit n × n matrix to be the matrix having diagonal
components all equal to 1, and all other components equal to 0. Thus
the unit n × n matrix, denoted by $I_n$, looks like this:

$$I_n = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}.$$

We can then define $A^0 = I$ (the unit matrix of the same size as A).
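
Powers of a square matrix, with the convention that the zeroth power is the unit matrix, can be computed by repeated multiplication. The sketch below is one straightforward way to do this; `identity`, `matmul`, `power` and the sample matrix `T` are illustrative names and values, not taken from the text.

```python
def identity(n):
    """The unit n x n matrix: 1 on the diagonal, 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product AB of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def power(A, k):
    """A multiplied by itself k times, with power(A, 0) the unit matrix."""
    result = identity(len(A))
    for _ in range(k):
        result = matmul(result, A)
    return result

# A sample 2 x 2 matrix (illustrative values only).
T = [[1, 1],
     [0, 1]]

print(power(T, 0))  # [[1, 0], [0, 1]]
print(power(T, 3))  # [[1, 3], [0, 1]]
```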

Theorem 1. Let A, B, C be matrices. Assume that A, B can be multiplied,
and A, C can be multiplied, and B, C can be added. Then A, B + C
can be multiplied, and we have

$$A(B + C) = AB + AC.$$

If x is a number, then

$$A(xB) = x(AB).$$

Proof. Let $A_i$ be the i-th row of A, and let $B^k$, $C^k$ be the k-th column
of B and C respectively. Then $B^k + C^k$ is the k-th column of B + C.
By definition, the ik-component of AB is $A_i \cdot B^k$, the ik-component of
AC is $A_i \cdot C^k$, and the ik-component of A(B + C) is $A_i \cdot (B^k + C^k)$.
Since

$$A_i \cdot (B^k + C^k) = A_i \cdot B^k + A_i \cdot C^k,$$

our first assertion follows. As for the second, observe that the k-th column
of xB is $xB^k$. Since

$$A_i \cdot xB^k = x(A_i \cdot B^k),$$

our second assertion follows.

Theorem 2. Let A, B, C be matrices such that A, B can be multiplied 
and B, C can be multiplied. Then A, BC can be multiplied, so can AB, C, 
and we have 

(AB)C = A(BC). 

Proof. Let $A = (a_{ij})$ be an m × n matrix, let $B = (b_{jk})$ be an n × r
matrix, and let $C = (c_{kl})$ be an r × s matrix. The product AB is an
m × r matrix, whose ik-component is equal to the sum

$$a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}.$$

We shall abbreviate this sum using our Σ notation by writing

$$\sum_{j=1}^{n} a_{ij} b_{jk}.$$

By definition, the il-component of (AB)C is equal to

$$\sum_{k=1}^{r} \left[\sum_{j=1}^{n} a_{ij} b_{jk}\right] c_{kl} = \sum_{k=1}^{r} \left[\sum_{j=1}^{n} a_{ij} b_{jk} c_{kl}\right].$$

The sum on the right can also be described as the sum of all terms

$$a_{ij} b_{jk} c_{kl},$$

where j, k range over all integers 1 ≤ j ≤ n and 1 ≤ k ≤ r respectively.

If we had started with the jl-component of BC and then computed the
il-component of A(BC) we would have found exactly the same sum,
thereby proving the theorem.
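
Associativity is easy to test on a small case. The sketch below compares (AB)C with A(BC) for three sample 2 × 2 matrices, and also compares AB with BA to show that the order of the factors matters in general; `matmul` and the matrices `A`, `B`, `C` are illustrative choices, not taken from the text.

```python
def matmul(A, B):
    """Product AB of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Three sample 2 x 2 matrices (illustrative values only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

left = matmul(matmul(A, B), C)   # (AB)C
right = matmul(A, matmul(B, C))  # A(BC)
print(left, right)               # [[5, 3], [11, 9]] [[5, 3], [11, 9]]

# Multiplication is associative, but AB and BA need not be equal.
print(matmul(A, B), matmul(B, A))  # [[2, 1], [4, 3]] [[3, 4], [1, 2]]
```

A check on one example does not replace the proof, but it makes the bookkeeping of the double sum concrete.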


Exercises 


1. Let I be the unit n × n matrix. Let A be an n × r matrix. What is IA?
If A is an m × n matrix, what is AI?

2. Let O be the matrix all of whose coordinates are 0. Let A be a matrix of
a size such that the product AO is defined. What is AO?

3. In each one of the following cases, find ( AB)C and A(BC). 


(a) A = 





(b) A 



(c) A = 



1 

1 

4 

0 




2 

1 

4 


4. Let A, B be square matrices of the same size, and assume that AB = BA.
Show that $(A + B)^2 = A^2 + 2AB + B^2$, and

$$(A + B)(A - B) = A^2 - B^2,$$

using the properties of matrices stated in Theorem 1.

5. Let 



Find AB and BA. 

6. Let



Let A, B be as in Exercise 5. Find CA, AC, CB, and BC. State the general 
rule including this exercise as a special case. 

7. Let X = (1, 0, 0) and let 

r 1 

A = I 2 0 

\l 1 

What is XA ? 

8. Let X = (0, 1,0), and let A be an arbitrary 3X3 matrix. How would 
you describe XA? What if X = (0, 0, 1)? Generalize to similar statements 
concerning n X n matrices, and their products with unit vectors. 



/0 2 \ 

9. Let A = [ ) . What is A 2 ? 

\0 0 / 

/ 0 0 \ 

10. Let A = ( ). What is A 2 ? 

V—5 0/ 

11. (a) Let A be the matrix 



Find A 2 , A 3 . Generalize to 4 X 4 matrices. 


(b) Let A be the matrix 





/' 

1 

‘\ 


(° 

1 

1 

Compute A 2 , A 3 , and A 4 . 

Vo 

0 

1 / 

12. Let X be the indicated column vector, 

and A the indicated matrix. Find 


AX as a column vector. 

(c) X = [ x 2 1 > A = 


0 1 0 


0 0 0 


13. Let 


2 1 3 
4 1 5 


Find AX for each of the following values of X. 


(d) X = [x 2 ]’A = 


0 0 0 

1 0 0 . 


(a) X = 0 (b) * = 1 


(c) X = 0 


(d) * = 


14. Let 


A = 1 -1 


\2 1 8 / 

Find /IX for each of the values of X given in Exercise 13. 

15. Let 


What is AX? 

16. Let X be a column vector having all its components equal to 0 except the
i-th component which is equal to 1. Let A be an arbitrary matrix, whose
size is such that we can form the product AX. What is AX?

17. Let X be a row vector having all its components equal to 0 except the
j-th component which is equal to 1. Let A be an arbitrary matrix, whose
size is such that we can form the product XA. What is XA? Work out
special cases when X has 2 components, then when X has three components.

18. Let a , b be numbers, and let 


(:;) 


What is AB? What is A n where n is a positive integer? 

19. If A is a square n × n matrix, we call a square matrix B an inverse for A
if $AB = BA = I_n$. Show that if B, C are inverses for A, then B = C.

20. Show that the matrix A in Exercise 18 has an inverse. What is this inverse?

21. Show that if A, B are n × n matrices which have inverses, then AB has an
inverse.


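22. Determine all 2 × 2 matrices A such that $A^2 = O$.

23. Let

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

Show that

$$A^2 = \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}.$$

Determine $A^n$ by induction for any positive integer n.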

24. Find a 2 X 2 matrix A such that A 2 

25. Let 




0 

2 

0 



1 

0 



Find A 2 , A 3 , A 4 . 

26. Let A be a diagonal matrix, with diagonal elements $a_1, \ldots, a_n$. What is
$A^2$, $A^3$, $A^k$ for any positive integer k?


27. Let 



Find A 3 . 




CHAPTER IX 


Linear Mappings 


We shall first define the general notion of a mapping, which generalizes 
the notion of a function. Among mappings, the linear mappings are the 
most important. A good deal of mathematics is devoted to reducing 
questions concerning arbitrary mappings to linear mappings. For one 
thing, they are interesting in themselves, and many mappings are linear. 
On the other hand, it is often possible to approximate an arbitrary mapping
by a linear one, whose study is much easier than the study of the
original mapping. This is done in the calculus of several variables. 

§1. Mappings 

As usual, a collection of objects will be called a set. A member of the 
collection is also called an element of the set. It is useful in practice to 
use short symbols to denote certain sets. For instance we denote by R the 
set of all numbers. To say that “x is a number” or that “x is an element
of R” amounts to the same thing. The set of n-tuples of numbers will
be denoted by R^n. Thus “X is an element of R^n” and “X is an n-tuple”
mean the same thing. Instead of saying that u is an element of a set S,
we shall also frequently say that u lies in S and we sometimes write u ∈ S.
If S and S' are two sets, and if every element of S' is an element of S, then
we say that S' is a subset of S. Thus the set of rational numbers is a subset
of the set of (real) numbers. To say that S is a subset of S' is to say that S
is part of S'. To denote the fact that S is a subset of S', we write S ⊂ S'.

If S_1, S_2 are sets, then the intersection of S_1 and S_2, denoted by
S_1 ∩ S_2, is the set of elements which lie in both S_1 and S_2. The union
of S_1 and S_2, denoted by S_1 ∪ S_2, is the set of elements which lie in S_1
or S_2.

Let S, S' be two sets. A mapping from S to S' is an association which 
to every element of S associates an element of S'. Instead of saying that 
F is a mapping from S into S', we shall often write the symbols F: S → S'.
A mapping will also be called a map, for the sake of brevity. 

A function is a special type of mapping, namely it is a mapping from 
a set into the set of numbers, i.e. into R.

We extend to mappings some of the terminology we have used for
functions. For instance, if T: S → S' is a mapping, and if u is an element
of S, then we denote by T(u), or Tu, the element of S' associated to u
by T. We call T(u) the value of T at u, or also the image of u under T.
The symbols T(u) are read “T of u”. The set of all elements T(u), when
u ranges over all elements of S, is called the image of T. If W is a subset
of S, then the set of elements T(w), when w ranges over all elements of W,
is called the image of W under T, and is denoted by T(W).


Let F: S —> S' be a map from a set S into a set S'. If x is an element 
of S, we often write 


x ↦ F(x)


with a special arrow ↦ to denote the image of x under F. Thus, for
instance, we would speak of the map F such that F(x) = x^2 as the map
x ↦ x^2.


Example. Let S and S' be both equal to R. Let f: R → R be the
function f(x) = x^2 (i.e. the function whose value at a number x is x^2).
Then f is a mapping from R into R. Its image is the set of numbers ≥ 0.

Example. Let S be the set of numbers ≥ 0, and let S' = R. Let
g: S → S' be the function such that g(x) = x^{1/2}. Then g is a mapping
from S into R.

Example. Let S be the set R^3, i.e. the set of 3-tuples. Let
A = (2, 3, −1). Let L: R^3 → R be the mapping whose value at a vector
X = (x, y, z) is A · X. Then L(X) = A · X. If X = (1, 1, −1), then the
value of L at X is 6.
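
The value computed in this example can be checked numerically. The following Python sketch is only an illustration; the helper name `dot` is our own choice and does not come from the text.

```python
def dot(a, b):
    """Dot product of two tuples of equal length."""
    return sum(ai * bi for ai, bi in zip(a, b))

A = (2, 3, -1)

def L(X):
    """The mapping L: R^3 -> R given by X |-> A . X."""
    return dot(A, X)

# Value of L at X = (1, 1, -1): 2*1 + 3*1 + (-1)*(-1) = 6
print(L((1, 1, -1)))
```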

Just as we did with functions, we describe a mapping by giving its
values. Thus, instead of making the statement in the preceding example describing
the mapping L, we would also say: Let L: R^3 → R be the mapping
L(X) = A · X. This is somewhat incorrect, but is briefer, and does not
usually give rise to confusion. More correctly, we can write X ↦ L(X)
or X ↦ A · X with the special arrow ↦ to denote the effect of the map L
on the element X.

Example. Let F: R^2 → R^2 be the mapping given by

F(x, y) = (2x, 2y).

Describe the image under F of the points lying on the circle x^2 + y^2 = 1.

Let (x, y) be a point on the circle of radius 1.

Let u = 2x and v = 2y. Then u, v satisfy the relation

(u/2)^2 + (v/2)^2 = 1

or in other words,

u^2 + v^2 = 4.

Hence (u, v) is a point on the circle of radius 2. Therefore the image
under F of the circle of radius 1 is a subset of the circle of radius 2.
Conversely, given a point (u, v) such that

u^2 + v^2 = 4,

let x = u/2 and y = v/2. Then the point (x, y) satisfies the equation
x^2 + y^2 = 1, and hence is a point on the circle of radius 1. Furthermore,
F(x, y) = (u, v). Hence every point on the circle of radius 2 is the image
of some point on the circle of radius 1. We conclude finally that the
image of the circle of radius 1 under F is precisely the circle of radius 2.
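
Both inclusions in this argument can be spot-checked numerically. The sketch below is not a proof: it samples a few points, and the sample angles are arbitrary choices of ours.

```python
import math

def F(x, y):
    """The dilation (x, y) |-> (2x, 2y)."""
    return (2 * x, 2 * y)

# Points on the circle of radius 1 land on the circle of radius 2.
forward_ok = True
for k in range(8):
    t = 2 * math.pi * k / 8
    u, v = F(math.cos(t), math.sin(t))
    forward_ok = forward_ok and math.isclose(u * u + v * v, 4.0)

# Conversely, a point (u, v) on the circle of radius 2 is the image of (u/2, v/2),
# which lies on the circle of radius 1.
u, v = 2 * math.cos(1.0), 2 * math.sin(1.0)
x, y = u / 2, v / 2
backward_ok = math.isclose(x * x + y * y, 1.0) and all(
    math.isclose(p, q) for p, q in zip(F(x, y), (u, v))
)
```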

Note. In general, let S, S' be two sets. To prove that S = S', one 
frequently proves that S is a subset of S' and that S' is a subset of S. This 
is what we did in the preceding argument. 

Observe that the association

(x, y) ↦ (2x, 2y)

is a dilation, i.e. a stretching by a factor of 2. Each point (x, y) is sent to the
point (2x, 2y) which lies on the same ray from the origin, at twice the
distance from the origin, as illustrated on Fig. 1.



Example. In general, let r be a positive number. The association

(x, y) ↦ (rx, ry)

is called dilation by the factor of r. We can also define it in 3-space, by

(x, y, z) ↦ (rx, ry, rz).

We shall study such dilations later when we take up area and volume, and 
we shall see how these change under dilations.

Example. A curve in space as we studied in Chapter II was a mapping.
For instance, we can define a map

F: R → R^3

by the association

t ↦ (2t, 10^t, t^3).

Thus F(t) = (2t, 10^t, t^3), and the value of F at 2 is

F(2) = (4, 100, 8).

In such a mapping we call

f_1(t) = 2t, f_2(t) = 10^t, f_3(t) = t^3

the coordinate functions of the mapping.

In general, a mapping F: R → R^3 can always be expressed in terms of
such functions, and we write

F(t) = (f_1(t), f_2(t), f_3(t)).
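
A mapping into R^3 assembled from its coordinate functions can be sketched in Python as follows. We read the coordinate functions of the example above as 2t, 10^t and t^3, which is consistent with the stated value F(2) = (4, 100, 8); the names `f1`, `f2`, `f3` are ours.

```python
def f1(t):
    return 2 * t

def f2(t):
    return 10 ** t

def f3(t):
    return t ** 3

def F(t):
    """The curve t |-> (f1(t), f2(t), f3(t)) in R^3."""
    return (f1(t), f2(t), f3(t))

print(F(2))  # (4, 100, 8)
```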

Example. Polar Coordinate Mapping. Let F: R^2 → R^2 be the mapping
defined by

F(r, θ) = (r cos θ, r sin θ).

Thus we may put

x = r cos θ
y = r sin θ.


Then F is a mapping, which is called the polar coordinate mapping. We
see that x and y depend on r, θ, and x, y are the coordinate functions of
the mapping. Again, we shall study this mapping later when we change
coordinates in a double integral. You should get well acquainted with
this mapping, and we work out one example of what it does. Let S be the
rectangle consisting of all points (r, θ) such that

0 ≤ r ≤ 2 and 0 ≤ θ ≤ π/2.

We want to describe the image of S under the polar coordinate mapping.

The image of S under the polar coordinate map F consists of all points
(x, y) whose polar coordinates (r, θ) satisfy the above inequalities. We
see that the image is just the sector of radius 2 in the first quadrant as
shown on Fig. 2.
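
One way to get acquainted with the polar coordinate mapping is to sample the rectangle and look at where the points land. The sketch below only checks one inclusion (sampled images lie in the quarter disc of radius 2); the grid size and the tolerance `eps` are arbitrary choices of ours.

```python
import math

def polar(r, theta):
    """The polar coordinate mapping (r, theta) |-> (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def in_sector(x, y, radius=2.0, eps=1e-12):
    """True if (x, y) lies in the first-quadrant sector of the given radius."""
    return x >= -eps and y >= -eps and x * x + y * y <= radius ** 2 + eps

# Sample the rectangle 0 <= r <= 2, 0 <= theta <= pi/2 on a grid.
n = 20
samples = [(2 * i / n, (math.pi / 2) * j / n)
           for i in range(n + 1) for j in range(n + 1)]
all_in_sector = all(in_sector(*polar(r, th)) for r, th in samples)
```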

Example. Translations. Let A be a vector, say in the plane. We let

T_A: R^2 → R^2

be the mapping such that

T_A(X) = X + A.

We call T_A the translation by A. On Fig. 3 we have drawn the translations
of various points P, Q, M under translation by A. We may describe the
image of a point P under translation by A as the point obtained from P
by moving P in the direction of A, for a distance equal to the distance
between O and A. Of course, the same notion also works in higher
dimensional space. If A is an n-tuple, then

T_A: R^n → R^n

is the mapping defined by the same equation as above, namely

T_A(X) = X + A.

You can visualize the picture (at least in R^3) similarly.



Figure 3 


Example. You should not forget the identity mapping I, defined on any
set S, and such that I(x) = x for all x in S.


Exercises 

1. Let L(X) = A • X, where A = (2,3, —1). Give L(X) when X is the vector: 

(a) (1,2,-3) (b) (-1,5,0) (c) (2,1,1) 

2. Let F: R → R^2 be the mapping such that F(t) = (e^t, t). What is F(1),
F(0), F(−1)?

3. Let A = (1, 1, −1, 3). Let F: R^4 → R be the mapping such that for any
vector X = (x_1, x_2, x_3, x_4) we have F(X) = X · A + 2. What is the value
of F(X) when (a) X = (1, 1, 0, −1) and (b) X = (2, 3, −1, 1)?

In each case, to prove that the image is equal to a certain set S, you must prove
that the image is contained in S, and also that every element of S is in the image.

4. Let F: R^2 → R^2 be the mapping defined by F(x, y) = (2x, 3y). Describe
the image of the points lying on the circle x^2 + y^2 = 1.

5. Let F: R^2 → R^2 be the mapping defined by F(x, y) = (xy, y). Describe
the image under F of the straight line x = 2.

6. Let F be the mapping defined by F(x, y) = (e^x cos y, e^x sin y). Describe
the image under F of the line x = 1. Describe more generally the image
under F of a line x = c, where c is a constant.

7. Let F be the mapping defined by F(t, u) = (cos t, sin t, u). Describe
geometrically the image of the (t, u)-plane under F.

8. Let F be the mapping defined by F(x, y) = (x/3, y/4). What is the image 
under F of the ellipse 


9. Draw the images of the following sets S under the polar coordinate mapping.
In each case, the set S consists of all points (r, θ) satisfying the stated
inequalities.

(a) 0 ≤ r < 1 and 0 < θ < π/3

(b) 0 ≤ r ≤ 3 and 0 ≤ θ ≤ 3π/4

(c) 1 ≤ r ≤ 2 and π/4 ≤ θ ≤ 3π/4

(d) 1 ≤ r < 2 and π/3 ≤ θ ≤ 2π/3

(e) 2 ≤ r ≤ 3 and π/6 ≤ θ ≤ π/4

(f) 2 ≤ r ≤ 3 and π/6 ≤ θ ≤ π/3

(g) 3 ≤ r ≤ 4 and π/2 ≤ θ ≤ 2π/3

10. In general, let S be the rectangle defined by the inequalities

0 < r_1 ≤ r ≤ r_2 and 0 < θ_1 ≤ θ ≤ θ_2.

Describe the image of S under the polar coordinate mapping.

11. Let A = (−1, 2). Draw the image of the point X under translation by A
when

(a) X = (2, 3) (b) X = (−5, 2) (c) X = (1, 1)

12. The identity mapping of R^n is equal to a translation T_A for some vector A.
True or false? If true, which vector A?

13. Draw the image of the following figures under translation T_A, where
A = (−1, 2).

(a) The circle as shown: [figure]

(b) The square, as shown: [figure]

(c) The circle as shown: [figure]

(d) The square as shown: [figure]


§2. Linear mappings 

Consider two Euclidean spaces R^n and R^m. In the applications, the values
for m and n are 1, 2, or 3, but they can all occur, so it is just as easy to
leave them indeterminate for what we are about to say.

A mapping

L: R^n → R^m

is called a linear mapping if it satisfies the following properties:

LM 1. For any elements X, Y in R^n we have

L(X + Y) = L(X) + L(Y).

LM 2. If c is a number, then

L(cX) = cL(X).

These properties should remind you of properties of multiplication of 
matrices, and also of the dot product of n-tuples. These in fact provide us 
with the examples which interest us for this course.

Example. Let A = (3, 1, −2). Then we have a linear map

L_A: R^3 → R

defined by the dot product,

L_A(X) = A · X.

If X = (x, y, z), then

L_A(X) = 3x + y − 2z.

In general, let

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

be an m × n matrix. We can then associate with A a map

L_A: R^n → R^m

by letting

L_A(X) = AX

for every column vector X in R^n. Thus L_A is defined by the association
X ↦ AX, the product being the product of matrices. That L_A is linear
is simply a special case of Theorem 1, Chapter VIII, §2, namely the theorem
concerning properties of multiplication of matrices. Indeed, we have

A(X + Y) = AX + AY and A(cX) = cAX


for all vectors X, Y in R^n and all numbers c. We call L_A the linear map
associated with the matrix A. We also say that A is the matrix representing
the linear map L_A.
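
The two linearity conditions LM 1 and LM 2 can be checked on sample data for a map of the form X ↦ AX. In the Python sketch below, the matrix `A`, the vectors `X`, `Y` and the number `c` are hypothetical values chosen only for illustration, and `mat_vec` is our own helper.

```python
def mat_vec(A, X):
    """Product of a matrix A (list of rows) with a column vector X (list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

A = [[2, 1, 0],
     [-1, 5, 3]]          # a hypothetical 2 x 3 matrix, so L_A: R^3 -> R^2
X = [1, 2, 3]
Y = [4, 0, -1]
c = 7

# LM 1: A(X + Y) = AX + AY
lm1 = mat_vec(A, [x + y for x, y in zip(X, Y)]) == [
    p + q for p, q in zip(mat_vec(A, X), mat_vec(A, Y))
]

# LM 2: A(cX) = c(AX)
lm2 = mat_vec(A, [c * x for x in X]) == [c * v for v in mat_vec(A, X)]
```

Checking a few vectors does not prove linearity, of course; the proof is the matrix identity quoted above.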


Example. If 



then 


$$L_A(X) = \begin{pmatrix} 6 + 7 \\ -3 + 35 \end{pmatrix}.$$



Theorem 1. If A, B are m × n matrices and if L_A = L_B, then A = B.
In other words, if matrices A, B give rise to the same linear map, then
they are equal.

Proof. By definition, we have A_i · X = B_i · X for all i, if A_i is the
i-th row of A and B_i is the i-th row of B. Hence (A_i − B_i) · X = 0 for
all i and all X. Taking X = A_i − B_i shows that A_i − B_i = O, and
A_i = B_i for all i. Hence A = B.

It can easily be shown that every linear map from R^n into R^m is of the
form L_A for some matrix A, in other words, the above example is the most
general type of linear map from R^n into R^m. The matrix A is called the
matrix associated with the linear map. We shall give the proof when
n = 2.


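Let

$$E^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad E^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

be the standard unit vectors. Let L: R^2 → R^2 be a linear map such that

$$L(E^1) = \begin{pmatrix} a \\ c \end{pmatrix} \quad\text{and}\quad L(E^2) = \begin{pmatrix} b \\ d \end{pmatrix}.$$

We shall prove that the matrix associated with L is precisely

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

First note that

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix} = L(E^1) \quad\text{and}\quad \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix} = L(E^2).$$

Let $X = \begin{pmatrix} x \\ y \end{pmatrix}$, so that X = xE^1 + yE^2. Then

L(X) = L(xE^1) + L(yE^2) = xL(E^1) + yL(E^2)
     = xAE^1 + yAE^2
     = A(xE^1 + yE^2)
     = AX.

This proves that L(X) = AX, and therefore that A is the matrix representing
L. A similar proof can be given for R^3, or R^n.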

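Example. Let L: R^2 → R^2 be a linear map such that

$$L(E^1) = \begin{pmatrix} 3 \\ 5 \end{pmatrix} \quad\text{and}\quad L(E^2) = \begin{pmatrix} -2 \\ 9 \end{pmatrix}.$$

Then the matrix associated with L is the matrix

$$A = \begin{pmatrix} 3 & -2 \\ 5 & 9 \end{pmatrix}.$$

You can check that it has the desired effect on the unit vectors, namely:

$$\begin{pmatrix} 3 & -2 \\ 5 & 9 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 3 & -2 \\ 5 & 9 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -2 \\ 9 \end{pmatrix}.$$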
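
The recipe used here (the images of the unit vectors become the columns of the matrix) can be sketched in Python. The function names `matrix_from_images` and `mat_vec` are ours, and the data are the values of this example, L(E^1) = (3, 5) and L(E^2) = (−2, 9).

```python
def mat_vec(A, X):
    """Product of a matrix A (list of rows) with a column vector X (list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def matrix_from_images(images):
    """Build the matrix whose j-th column is images[j], the image of the j-th unit vector."""
    m = len(images[0])
    return [[img[i] for img in images] for i in range(m)]

A = matrix_from_images([[3, 5], [-2, 9]])   # rows of A: [3, -2] and [5, 9]

# A has the desired effect on the unit vectors.
image_E1 = mat_vec(A, [1, 0])
image_E2 = mat_vec(A, [0, 1])

# For X = (x, y), AX agrees with x L(E^1) + y L(E^2).
x, y = 4, -1
by_linearity = [x * p + y * q for p, q in zip([3, 5], [-2, 9])]
```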


Exercises 

1. In each case, find the vector L_A(X).

3 


(c) A 


C D-O 
-CD-'-O 


(a) A = ^ 1 » X = 

A 1 


2. Let r be a number. Let F_r: R^n → R^n be the dilation mapping, defined by
the formula

F_r(X) = rX.

Exhibit a matrix A such that F_r(X) = AX.

3. Let a, b be numbers. Let F_{a,b}: R^2 → R^2 be the mapping such that


-0 - O 


Exhibit a matrix A such that F_{a,b}(X) = AX.

4. Let a_1, a_2, a_3 be numbers. If X = (x, y, z) let F(X) = (a_1 x, a_2 y, a_3 z).
Writing X as a column vector, exhibit a matrix A such that F(X) = AX.

5. Let F(x, y, z ) = ( x , y). Writing X as a column vector, exhibit a matrix A 
such that F(X) = AX. 

6. Let F(x, y, z) = x. Writing X as a column vector, exhibit a matrix A such 
that F(X) = AX. 

7. Let F(x, y, z) = (x, z). Writing X as a column vector, exhibit a matrix A
such that F(X) = AX.

8. Same question if F(x, y, z) = (y, z).




9. Let F: R^4 → R^2 be the mapping such that

F(x_1, x_2, x_3, x_4) = (x_1, x_2).

Writing X as a column vector, exhibit a matrix A such that F(X) = AX.

10. Let F: R^4 → R^3 be the mapping such that

F(x_1, x_2, x_3, x_4) = (x_1, x_2, x_3).

Writing X as a column vector, exhibit a matrix A such that F(X) = AX.

11. Let A be an element of R^3. Suppose that the translation by A is a linear map.
What is the only possibility for A? If A ≠ O, can T_A be a linear map?
Proof? 

12. Let L: R 2 —» R 2 be the linear map such that 


t(Ei) - o 


and L(E 2 ) = 


■0 


What is the matrix associated with L ? 
13. Same question if 

'- 1 ' 


L(E l ) - 


n 


and L(E 2 ) = 


0 - 


14. Let L: R 3 —» R 3 be a linear map such that 



L(E?) = 




Here, 



What is the matrix associated with L? Verify that it has the desired effect
on the unit vectors. 

15. Write out the proof that if E^1, E^2, E^3 are the standard unit vectors in R^3,
and if L: R^3 → R^3 is the linear map such that

$$L(E^1) = \begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix}, \quad L(E^2) = \begin{pmatrix} a_{12} \\ a_{22} \\ a_{32} \end{pmatrix}, \quad L(E^3) = \begin{pmatrix} a_{13} \\ a_{23} \\ a_{33} \end{pmatrix},$$

then the matrix A associated with L is the matrix (a_{ij}), that is

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.$$

16. Let L : R 3 —► R 3 be the linear map such that 

r\ ( 4 ' 

L(E') = 1 5 1» L{E 2 ) = I 1 

What is the matrix associated with L? Verify directly that it has the desired 
effect on the unit vectors. 

17. Let L: R → R^n be a linear map. Prove that there exists a vector A in R^n
such that for all t in R we have

L(t) = tA. 

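18. Let L: R^2 → R^3 be a linear map. Let

$$E^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad E^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

be the unit vectors in R^2. Suppose that

$$L(E^1) = \begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix}, \quad L(E^2) = \begin{pmatrix} a_{12} \\ a_{22} \\ a_{32} \end{pmatrix}.$$

In terms of the a_{ij}, what is the matrix A associated with L?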

19. Let L : R 2 —> R 3 be a linear map, and suppose that E 1 , E 2 are the unit 
vectors in R 2 . Let 

n r 

L(E') = 1 1 and L(E 2 ) = | 7 

V—4 / \—8> 

What is the matrix A associated with L ? 

§3. Geometric applications 

Let P, A be elements of R n . We define the line segment between P and 
P + A to be the set of all points 


P + tA,   0 ≤ t ≤ 1.

This line segment is illustrated in the following picture.

For instance, if t = 1/2, then P + (1/2)A is the point midway between P and
P + A. Similarly, if t = 1/3, then P + (1/3)A is the point one-third of the way
between P and P + A (Fig. 5).



If P, Q are elements of R^n, let A = Q − P. Then the line segment
between P and Q is the set of all points P + tA, or

P + t(Q − P),   0 ≤ t ≤ 1.



Observe that we can rewrite the expression for these points in the form

(1)   (1 − t)P + tQ,   0 ≤ t ≤ 1,

and letting s = 1 − t, t = 1 − s, we can also write it as

sP + (1 − s)Q,   0 ≤ s ≤ 1.

Finally, we can write the points of our line segment in the form

(2)   t_1 P + t_2 Q

with t_1, t_2 ≥ 0 and t_1 + t_2 = 1. Indeed, letting t = t_2, we see that
every point which can be written in the form (2) satisfies (1). Conversely,
we let t_1 = 1 − t and t_2 = t and see that every point of the form (1)
can be written in the form (2).

Let L: R^n → R^m be a linear map. Let S be the line segment in R^n between
two points P, Q. Then the image L(S) of this line segment is the line segment
in R^m between the points L(P) and L(Q). This is obvious from (2),
because

L(t_1 P + t_2 Q) = t_1 L(P) + t_2 L(Q).
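
The identity L(t_1 P + t_2 Q) = t_1 L(P) + t_2 L(Q) can be checked on sample points of a segment. In the sketch below, the matrix `A` and the endpoints `P`, `Q` are hypothetical values chosen for illustration; exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction

def mat_vec(A, X):
    """Product of a matrix A (list of rows) with a column vector X (list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def combo(t1, t2, U, V):
    """The point t1*U + t2*V."""
    return [t1 * u + t2 * v for u, v in zip(U, V)]

A = [[3, 1], [-1, 5]]        # hypothetical matrix representing L
P, Q = [2, 1], [-1, 1]       # hypothetical endpoints of the segment
LP, LQ = mat_vec(A, P), mat_vec(A, Q)

# For t1, t2 >= 0 with t1 + t2 = 1, the image of t1*P + t2*Q is t1*L(P) + t2*L(Q).
checks = []
for k in range(5):
    t2 = Fraction(k, 4)
    t1 = 1 - t2
    checks.append(mat_vec(A, combo(t1, t2, P, Q)) == combo(t1, t2, LP, LQ))
segment_maps_to_segment = all(checks)
```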

We shall now generalize this discussion to higher dimensional figures.
Let P, Q be elements of R^n, and assume that they are ≠ O, and Q is not a
scalar multiple of P. We define the parallelogram spanned by P and Q to
be the set of all points

t_1 P + t_2 Q,   0 ≤ t_i ≤ 1 for i = 1, 2.



This definition is clearly justified since t_1 P is a point of the segment
between O and P (Fig. 7), and t_2 Q is a point of the segment between
O and Q. For all values of t_1, t_2 ranging independently between 0 and 1, we
see geometrically that t_1 P + t_2 Q describes all points of the parallelogram.

At the end of §1 we defined translations. We obtain the most general
parallelogram (Fig. 8) by taking the translation of the parallelogram just
described. Thus if A is an element of R^n, the translation by A of the
parallelogram spanned by P and Q consists of all points

A + t_1 P + t_2 Q,   0 ≤ t_i ≤ 1 for i = 1, 2.

As with line segments, we see that if

L: R^n → R^m

is a linear map, and if S is a parallelogram as described above, then the
image of S is again a parallelogram, provided that L(P) and L(Q) do not
lie on the same line through the origin (i.e. L(P) is not a scalar multiple
of L(Q)). This is immediately seen, because the image of S under L
consists of all points

L(A + t_1 P + t_2 Q) = L(A) + t_1 L(P) + t_2 L(Q),

with

0 ≤ t_i ≤ 1 for i = 1, 2.

We see again the usefulness of the conditions for linearity LM 1 and LM 2.

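Example. Let S be the parallelogram spanned by the vectors P = (1, 2)
and Q = (−1, 5). Let L: R^2 → R^2 be the linear map L_A, where A is the matrix

$$A = \begin{pmatrix} 3 & 1 \\ -1 & 5 \end{pmatrix}.$$

Then, writing P, Q as vertical vectors, we obtain

$$L(P) = \begin{pmatrix} 3 & 1 \\ -1 & 5 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 5 \\ 9 \end{pmatrix} \quad\text{and}\quad L(Q) = \begin{pmatrix} 3 & 1 \\ -1 & 5 \end{pmatrix}\begin{pmatrix} -1 \\ 5 \end{pmatrix} = \begin{pmatrix} 2 \\ 26 \end{pmatrix}.$$

Hence the image of S under L is the parallelogram spanned by the vectors
(5, 9) and (2, 26).

On the next figure, we have drawn a typical situation of the image of a
parallelogram under a linear map.

Figure 9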
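
The parallelogram example above, with P = (1, 2), Q = (−1, 5) and images (5, 9) and (2, 26), can be reproduced numerically. The sketch assumes the matrix with rows (3, 1) and (−1, 5), which is the only 2 × 2 matrix sending P and Q to those images; `mat_vec` is our own helper.

```python
def mat_vec(A, X):
    """Product of a matrix A (list of rows) with a column vector X (list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

A = [[3, 1], [-1, 5]]        # assumed matrix, determined by the stated images
P, Q = [1, 2], [-1, 5]
LP, LQ = mat_vec(A, P), mat_vec(A, Q)

# The four corners t1*P + t2*Q (t1, t2 in {0, 1}) go to the corners
# t1*L(P) + t2*L(Q) of the image parallelogram.
corners_ok = all(
    mat_vec(A, [t1 * p + t2 * q for p, q in zip(P, Q)])
    == [t1 * lp + t2 * lq for lp, lq in zip(LP, LQ)]
    for t1 in (0, 1) for t2 in (0, 1)
)
```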


A similar discussion can be carried out in 3-space. It is good practice 
for you to write it up yourself. Do Exercise 5. 


Exercises 

1. Let L be the linear map represented by the matrix 



Let S be the line segment between P and Q. Draw the image of S under L,
indicating L(P) and L(Q) in each of the following cases.

(a) P = (2, 1) and Q = (−1, 1)

(b) P = (3, −1) and Q = (1, 2)

(c) P = (1, 1) and Q = (1, −1)

(d) P = (2, −1) and Q = (1, 2)


2. In cases (a), (b), (c), and (d) of Exercise 1, let T be the parallelogram spanned
by P and Q. Draw the image of T by the linear map L of Exercise 1, indicating
in each case L(P) and L(Q).


3. Let

$$E^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad E^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

be the standard unit vectors. Write down their images under the linear
map L represented by the matrix



Let S be the square spanned by E^1 and E^2. Draw the image of this square
under L, indicating L(E^1) and L(E^2).

4. Let E^1, E^2 again be the standard unit vectors, drawn vertically. Let L be
the linear map represented by the matrix



Let S be the square spanned by E^1, E^2. Draw the image L(S), again
indicating L(E^1) and L(E^2).

5. (a) Give a definition of the box (parallelepiped) spanned by three vectors 

A, B, C in R 3 . 

(b) Let L: R 3 -» R 3 be a linear map. Prove that the image of such a box 
under L is again a box, spanned by L(A ), L(B), L(C ) (provided that the 
segments from O to L(A), L(B), L(C), respectively, do not all lie in a 
plane, otherwise you get a “degenerate” box). 

(c) Draw a picture for this in 3-dimensional space. 

6. Let L be the linear map of R 3 into itself represented by the matrix 



Let S be the cube spanned by the three unit vectors E^1, E^2, E^3. Give
explicitly three vectors spanning L(S).

7. Same questions as in Exercise 6, if L is represented by the matrix 



8. Let X(t) = P + tA, with t in R, be the parametrization of a straight line in
R^n. Let L: R^n → R^m be a linear map. Suppose that L(A) ≠ O. Prove that
the image of the straight line is a straight line.

9. Let S be a line passing through two distinct points P and Q, in R^n. Let
L: R^n → R^m be a linear map, such that L(P) ≠ L(Q).

(a) Give a parametric representation of the line S. 

(b) Give a parametric representation of the line L(S ). 

10. Let A, B be non-zero vectors in R n and assume that neither is a scalar 
multiple of the other. Such vectors are called independent. We define the 
plane spanned by A and B to be the set of all points 

tA + sB, 

for all real numbers t, s. Observe that this is the 2-dimensional analogue of 
the parametrization of a line. Let L: R^n → R^m be a linear map. Assume that
L(A) and L(B) are independent. Prove that the image of the plane spanned
by A and B is a plane (spanned by which vectors?).

11. Let A, B be independent vectors in R^n, and let P be a point. We define the
plane through P parallel to A, B to be the set of all points


P + tA + sB,

where t, s range over all real numbers. Let L: R^n → R^m be a linear map
such that L(A) and L(B) are independent. Prove that the image of the
preceding plane is also a plane.

The plane of Exercise 11 looks like this.



It is the translation by P of the plane in Exercise 10.

§4. Composition and inverse of mappings 

This section will be useful for Chapter XI, §3, §4 and Chapter XIII. 
You can omit these without impairing your understanding of the rest of 
the book. 

Before we discuss linear mappings, we have to make some more remarks 
on mappings in general. You recall that in studying functions of one 
variable, you met composite functions and the chain rule for differentiation. 
We shall meet a similar situation in several variables. 

In one variable, let

f: R → R and g: R → R

be functions. Then we can form the composite function g ∘ f defined by

(g ∘ f)(x) = g(f(x)).

Let U, V, W be sets. Let

F: U → V and G: V → W

be mappings. Then we can form the composite mapping from U into W,
denoted by G ∘ F. It is by definition the mapping defined by

(G ∘ F)(u) = G(F(u))

for all u in U.

Example. Let G: R^2 → R^2 be the mapping such that

G(Y) = 3Y.

Let F: R^2 → R^2 be the mapping such that F(X) = X + A, where
A = (1, −2).

Then

G(F(X)) = G(X + A) = 3(X + A) = 3X + 3A.

Our mapping G ∘ F is the composite of a translation and a dilation.
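
The composite of the translation by A = (1, −2) followed by the dilation by 3 can be written out directly. The sketch below is illustrative; `compose` and the sample point `X` are our own choices.

```python
def compose(G, F):
    """Return the composite mapping G o F, i.e. u |-> G(F(u))."""
    def composite(u):
        return G(F(u))
    return composite

A = (1, -2)

def F(X):
    """Translation by A: X |-> X + A."""
    return tuple(x + a for x, a in zip(X, A))

def G(Y):
    """Dilation by 3: Y |-> 3Y."""
    return tuple(3 * y for y in Y)

GF = compose(G, F)

X = (4, 5)                                              # sample point
expected = tuple(3 * x + 3 * a for x, a in zip(X, A))   # 3X + 3A
```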

Example. Let G: R^2 → R^3 be the mapping such that

G(x, y) = (x^2, xy, sin y).

If (u, v, w) are the coordinates of R^3, we may set

u = x^2, v = xy, w = sin y.

Let F: R^3 → R^3 be the mapping such that

F(u, v, w) = (u^3, uv, vw).

Then

F(G(x, y)) = (x^6, x^3 y, xy sin y).

The composition of mappings is associative. More precisely, let U, V, W,
S be sets. Let

F: U → V, G: V → W, and H: W → S

be mappings. Then

H ∘ (G ∘ F) = (H ∘ G) ∘ F.

Proof. Here again, the proof is very simple. By definition, we have,
for any element u of U:

(H ∘ (G ∘ F))(u) = H((G ∘ F)(u)) = H(G(F(u))).

On the other hand,

((H ∘ G) ∘ F)(u) = (H ∘ G)(F(u)) = H(G(F(u))).

By definition, this means that (H ∘ G) ∘ F = H ∘ (G ∘ F).

If S is any set, the identity mapping I_S is defined to be the map such
that I_S(x) = x for all x ∈ S. We note that the identity map is both
injective and surjective. If we do not need to specify the reference to S
(because it is made clear by the context), then we write I instead of I_S.
Thus we have I(x) = x for all x ∈ S.

Finally, we define inverse mappings. Let F: S → S' be a mapping
from one set into another set. We say that F has an inverse if there exists
a mapping G: S' → S such that

G ∘ F = I_S and F ∘ G = I_{S'}.

By this we mean that the composite maps G ∘ F and F ∘ G are the identity
mappings of S and S' respectively.

Example. Let S = S' be the set of all numbers ≥ 0. Let

f: S → S'

be the map such that f(x) = x^2. Then f has an inverse mapping, namely
the map g: S → S such that g(x) = √x.

Example. Let R^+ be the set of numbers > 0 and let f: R → R^+ be
the map such that f(x) = e^x. Then f has an inverse mapping which is
nothing but the logarithm.

Example. Let A be a vector in R^3 and let

T_A: R^3 → R^3

be the translation by A. By definition, we recall that this means

T_A(X) = X + A.

If B is another vector in R^3, then the composite mapping T_B ∘ T_A has the
value

(T_B ∘ T_A)(X) = T_B(T_A(X))
              = T_B(X + A)
              = X + A + B.

If B = −A, we see that

T_{−A}(T_A(X)) = X + A − A = X,

and similarly that T_A(T_{−A}(X)) = X. Hence T_{−A} is the inverse mapping
of T_A. In words, we may say that the inverse mapping of translation by A
is translation by −A. Of course, the same holds in R^n.
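
That translation by −A undoes translation by A is easy to confirm on sample data. In the sketch below, the vector `A` and the point `X` are hypothetical values, and `translation` is our own helper returning the map X ↦ X + A.

```python
def translation(A):
    """Return the translation T_A: X |-> X + A."""
    def T(X):
        return tuple(x + a for x, a in zip(X, A))
    return T

A = (1, -2, 5)                                   # hypothetical vector in R^3
T_A = translation(A)
T_minus_A = translation(tuple(-a for a in A))

X = (3, 4, 7)                                    # sample point
left_inverse = T_minus_A(T_A(X)) == X            # T_{-A}(T_A(X)) = X
right_inverse = T_A(T_minus_A(X)) == X           # T_A(T_{-A}(X)) = X
```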


Figure 10


Let

f: S → S'

be a map. We say that f is injective if whenever x, y ∈ S and x ≠ y,
then f(x) ≠ f(y). In other words, f is injective means that f takes on
distinct values at distinct elements of S. For example, the map

f: R → R

such that f(x) = x^2, is not injective, because f(1) = f(−1) = 1. Also
the function x ↦ sin x is not injective, because sin x = sin (x + 2π).
However, the map f: R → R such that f(x) = x + 1 is injective, because
if x + 1 = y + 1, then x = y.

Again, let f: S → S' be a mapping. We shall say that f is surjective if
the image of f is all of S'. Again, the map

f: R → R

such that f(x) = x^2, is not surjective, because its image consists of all
numbers ≥ 0, and this image is not equal to all of R. On the other hand,
the map of R into R given by x ↦ x^3 is surjective, because given a number
y there exists a number x such that y = x^3 (the cube root of y). Thus
every number is in the image of our map.

Let R^+ be the set of real numbers ≥ 0. As a matter of convention, we
agree to distinguish between the maps

R → R and R^+ → R^+


given by the same formula x ↦ x^2. The point is that when we view the
association x ↦ x^2 as a map of R into R, then it is not surjective, and it
is not injective. But when we view this formula as defining a map from
R^+ into R^+, then it gives both an injective and surjective map of R^+
into itself, because every positive number has a positive square root, and
such a positive square root is uniquely determined.

In general, when dealing with a map f: S → S', we must therefore always
specify the sets S and S', to be able to say that f is injective, or surjective,
or neither. To have a completely accurate notation, we should write

f_{S,S'}

or some such symbol which specifies S and S' in the notation, but this
becomes too clumsy, and we prefer to use the context to make our meaning
clear.

Let

f: S → S'

be a map which has an inverse mapping g. Then f is both injective and
surjective.

Proof. Let x, y ∈ S and x ≠ y. Let g: S' → S be the inverse mapping
of f. If f(x) = f(y), then we must have

x = g(f(x)) = g(f(y)) = y,

which is impossible. Hence f(x) ≠ f(y), and therefore f is injective. To
prove that f is surjective, let z ∈ S'. Then

f(g(z)) = z

by definition of the inverse mapping, and hence z = f(x), where x = g(z).
This proves that f is surjective.

The converse of the statement we just proved is also true, namely: 

Let f: S → S' be a map which is both injective and surjective. Then f
has an inverse mapping.

Proof. Given z ∈ S', since f is surjective, there exists x ∈ S such that
f(x) = z. Since f is injective, this element x is uniquely determined by z,
and we can therefore define

g(z) = x.

By definition of g, we find that f(g(z)) = z, and g(f(x)) = x, so that g is
an inverse mapping for f.

Thus we can say that a map f: S → S' has an inverse mapping if and
only if f is both injective and surjective.

Using another terminology, we can also say that a map

f: S → S'

which has an inverse mapping establishes a one-one correspondence
between the elements of S and the elements of S'.
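
For maps between finite sets, the construction of the inverse given in the proof can be carried out mechanically. The Python sketch below represents a map S → S' as a dictionary; this representation and the function names are our own illustration, not something from the text.

```python
def is_injective(f):
    """f is a dict representing a map; distinct elements must have distinct values."""
    return len(set(f.values())) == len(f)

def is_surjective(f, target):
    """Every element of the target set must be a value of f."""
    return set(f.values()) == set(target)

def inverse(f, target):
    """Return the inverse map g with g(f(x)) = x, or None if f has no inverse."""
    if not (is_injective(f) and is_surjective(f, target)):
        return None
    return {value: key for key, value in f.items()}

# A map which is both injective and surjective has an inverse.
f = {1: "a", 2: "b", 3: "c"}
g = inverse(f, ["a", "b", "c"])

# The map x |-> x^2 on {-1, 0, 1} is not injective, so it has none.
square = {-1: 1, 0: 0, 1: 1}
no_inverse = inverse(square, [0, 1])
```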

We shall, of course, be mostly concerned with linear mappings. 

Let F: R^n → R^m and G: R^m → R^s be linear maps. Then the composite
map G ∘ F is also a linear map.

Proof. This is very easy to prove. Let u, v be elements of R^n. Since F
is linear, we have F(u + v) = F(u) + F(v). Hence

(G ∘ F)(u + v) = G(F(u + v)) = G(F(u) + F(v)).

Since G is linear, we obtain

G(F(u) + F(v)) = G(F(u)) + G(F(v)).

Hence

(G ∘ F)(u + v) = (G ∘ F)(u) + (G ∘ F)(v).

Next, let c be a number. Then

(G ∘ F)(cu) = G(F(cu))

            = G(cF(u)) (because F is linear)

            = cG(F(u)) (because G is linear).

This proves that G ∘ F is a linear mapping.

We can also see this with matrices. Suppose that A is the matrix
associated with F, and B is the matrix associated with G. Then by definition,
we have

F(X) = AX, for X in R^n,

and

G(Y) = BY, for Y in R^m.

Hence

G(F(X)) = B(AX) = (BA)X,

and we see that the product BA is the matrix associated with the linear
map G ∘ F. In other words, the product of the matrices associated with G
and F, respectively, is the matrix associated with G ∘ F.
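
The relation G(F(X)) = (BA)X can be verified on a small example. The matrices `A`, `B` and the vector `X` below are hypothetical values chosen only for illustration; `mat_vec` and `mat_mul` are our own helpers.

```python
def mat_vec(A, X):
    """Product of a matrix A (list of rows) with a column vector X (list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def mat_mul(B, A):
    """Matrix product BA, where the number of columns of B equals the number of rows of A."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

A = [[1, 2], [0, 1], [3, -1]]   # 3 x 2 matrix: F(X) = AX maps R^2 into R^3
B = [[2, 0, 1], [1, 1, 0]]      # 2 x 3 matrix: G(Y) = BY maps R^3 into R^2
X = [5, -2]

via_composition = mat_vec(B, mat_vec(A, X))   # G(F(X)) = B(AX)
via_product = mat_vec(mat_mul(B, A), X)       # (BA)X
```

Note the order: the matrix of G ∘ F is BA, not AB, since F is applied first.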

Let F: R^n → R^n be a linear mapping. We shall say that F is invertible
if there exists a linear mapping

G: R^n → R^n

such that G ∘ F = I and F ∘ G = I. [It can be shown that if an inverse for
F exists as a mapping, then this inverse is necessarily linear, but we don’t
give the proof. It is an easy exercise.] Similarly, let A be an n × n matrix.
We say that A is invertible if there exists an n × n matrix B such that
AB = BA = I_n, where I_n is the unit n × n matrix. We denote B by A^{-1}.

If F is a linear mapping as above, then we know that it has an associated
matrix A, such that

F(X) = AX, all X in R^n.

Suppose that F is invertible, and that G is its inverse linear mapping. Then
G also has an associated matrix B, and since G(F(X)) = X, we must have

BAX = X,

for all X in R^n. Similarly, we must also have ABX = X for all X in R^n.
In particular, this must be true if X is any one of the standard unit vectors,
and from this we see that AB = BA = I_n, the unit n × n matrix. Thus
B = A^{-1}. In other words:

If A is the matrix associated with an invertible linear mapping L: R^n → R^n,
then A^{-1} is the matrix associated with the inverse of L.

It is usually a tedious process to find the inverse of a matrix, and this 
process involves linear equations. For 2X2 matrices, however, the 
process is short. We shall discuss it in connection with determinants. 


Exercises 

1. Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be the map such that F(X) = 7X. Prove that F has an 
inverse mapping, and that this inverse is linear. Do the same if $F: \mathbf{R}^n \to \mathbf{R}^n$ 
is defined by the same formula. 

2. Let $F: \mathbf{R}^n \to \mathbf{R}^n$ be the map such that F(X) = -8X. Prove that F is 
invertible, and write down its inverse explicitly. 

3. Let c be a number $\neq 0$ and let $L: \mathbf{R}^n \to \mathbf{R}^n$ be the map such that L(X) = cX. 
Prove that L has an inverse linear map, and write it down explicitly. 

4. Let A, B, C be square matrices of the same size and assume that they are 
invertible. Prove that AB is invertible, and express its inverse in terms of 
$A^{-1}$ and $B^{-1}$. Also show that ABC is invertible. 

5. Let A be a square matrix such that $A^2 = 0$. Show that I - A is invertible. 
(I is the unit matrix of the same size as A.) 

6. Let A be a square matrix such that $A^2 + 2A + I = 0$. Show that A is 
invertible. 

7. Let A be a square matrix such that $A^3 = 0$. Show that I - A is invertible. 



CHAPTER X 


Determinants 


In this chapter we carry out the theory of determinants for the case of 
2 × 2 and 3 × 3 matrices. 

§1. Determinants of order 2 

Let 

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

be a 2 × 2 matrix. We define its determinant to be ad - cb. Thus the 
determinant is a number. We denote it by 

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$


For example, the determinant of the matrix 

$$\begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix}$$

is equal to 2 · 4 - 1 · 1 = 7. The determinant of 

$$\begin{pmatrix} -2 & -3 \\ 4 & 5 \end{pmatrix}$$

is equal to (-2) · 5 - (-3) · 4 = -10 + 12 = 2. 
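The definition ad - bc translates directly into a one-line function. The sketch below is an illustration only; the function name `det2` is an assumption, and the second call uses one arrangement of entries that matches the products (-2) · 5 and (-3) · 4 computed in the example.

```python
def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

first = det2(2, 1, 1, 4)      # 2*4 - 1*1
second = det2(-2, -3, 4, 5)   # (-2)*5 - (-3)*4
print(first, second)          # prints: 7 2
```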


Theorem 1. If A is a 2 × 2 matrix, then the determinant of A is equal to 
the determinant of the transpose of A. In other words, 

$$D(A) = D({}^t\!A).$$

Proof. This is immediate from the definition of the determinant. We 
have 

$$|A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} \quad \text{and} \quad |{}^t\!A| = \begin{vmatrix} a & c \\ b & d \end{vmatrix},$$

and 

$$ad - bc = ad - cb.$$

Of course, the property expressed in Theorem 1 is very simple. We give 
it here because it is satisfied by 3 × 3 determinants which will be studied 
later. 

Consider a 2 × 2 matrix A with columns $A^1$, $A^2$. The determinant 
D(A) has interesting properties with respect to these columns, which we 
shall describe. Thus it is useful to use the notation 

$$D(A) = D(A^1, A^2)$$

to emphasize the dependence of the determinant on its columns. If the two 
columns are denoted by 

$$B = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \quad \text{and} \quad C = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},$$

then we would write 

$$D(B, C) = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix} = b_1 c_2 - c_1 b_2.$$

We may view the determinant as a certain type of "product" between the 
columns B and C. To what extent does this product satisfy the same rules 
as the product of numbers? Answer: To some extent, which we now 
determine precisely. 

To begin with, this "product" satisfies distributivity. In the determinant 
notation, this means: 

D1. If B = B' + B", i.e. 

$$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} b_1' \\ b_2' \end{pmatrix} + \begin{pmatrix} b_1'' \\ b_2'' \end{pmatrix},$$

then 

$$D(B' + B'', C) = D(B', C) + D(B'', C).$$

Similarly, if C = C' + C", then 

$$D(B, C' + C'') = D(B, C') + D(B, C'').$$

Proof. Of course, the proof is quite simple using the definition of the 
determinant. We have 

$$D(B' + B'', C) = \begin{vmatrix} b_1' + b_1'' & c_1 \\ b_2' + b_2'' & c_2 \end{vmatrix}$$
$$= (b_1' + b_1'')c_2 - (b_2' + b_2'')c_1$$
$$= b_1' c_2 + b_1'' c_2 - b_2' c_1 - b_2'' c_1$$
$$= D(B', C) + D(B'', C).$$

Distributivity on the other side is proved similarly. 

D2. If x is a number, then 

$$D(xB, C) = x \cdot D(B, C) = D(B, xC).$$

Proof. We have 

$$D(xB, C) = \begin{vmatrix} xb_1 & c_1 \\ xb_2 & c_2 \end{vmatrix} = xb_1 c_2 - xb_2 c_1 = x(b_1 c_2 - b_2 c_1) = xD(B, C).$$

Again, the other equality is proved similarly. 

Properties D1 and D2 may be expressed by saying that the determinant 
is linear as a function of each column. 

D3. If the two columns of the matrix are equal, then the determinant is 
equal to 0. In other words, 

$$D(B, B) = 0.$$

Proof. This is obvious, because 

$$\begin{vmatrix} b_1 & b_1 \\ b_2 & b_2 \end{vmatrix} = b_1 b_2 - b_2 b_1 = 0.$$


The two vectors 

$$E^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad E^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

are the standard unit vectors. The matrix formed by them, namely 

$$E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

is the unit matrix. We have: 

D4. If E is the unit matrix, then $D(E) = D(E^1, E^2) = 1$. 

This is obvious. 

These four basic properties are fundamental, and other properties can 
be deduced from them, without going back to the definition of the 
determinant in terms of the components of the matrix. 

D5. If we add a multiple of one column to the other, then the value of the 
determinant does not change. In other words, let x be a number. 
Then 

$$D(B + xC, C) = D(B, C) \quad \text{and} \quad D(B, C + xB) = D(B, C).$$

Written out in terms of components, the first relation reads 

$$\begin{vmatrix} b_1 + xc_1 & c_1 \\ b_2 + xc_2 & c_2 \end{vmatrix} = \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}.$$

Proof. Using D1, D2, D3 in succession, we find that 

$$D(B + xC, C) = D(B, C) + D(xC, C)$$
$$= D(B, C) + xD(C, C) = D(B, C).$$

A similar proof applies to D(B, C + xB). 

D6. If the two columns are interchanged, then the value of the determinant 
changes by a sign. In other words, we have 

$$D(B, C) = -D(C, B).$$

Proof. Again, we use D1 repeatedly, and then D3, and get 

$$0 = D(B + C, B + C) = D(B, B + C) + D(C, B + C)$$
$$= D(B, B) + D(B, C) + D(C, B) + D(C, C)$$
$$= D(B, C) + D(C, B).$$

This proves that D(B, C) = -D(C, B), as desired. 

Of course, you can also give a proof using the components of the matrix. 
Do this as an exercise. However, there is some point in doing it as above, 
because in the study of determinants in the higher-dimensional case later, a 
proof with components becomes much messier, while the proof following 
the same pattern as the one we have given remains neat. 
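Properties D1 through D6 can be spot-checked on sample columns. The following Python sketch is an illustration only: the vectors B1, B2, C, the number x, and the helper names are assumptions chosen for the check, and a numerical check on one example is of course not a proof.

```python
def D(B, C):
    """Determinant of the 2 x 2 matrix whose columns are B and C."""
    return B[0] * C[1] - C[0] * B[1]

def add(U, V):
    """Componentwise sum of two column vectors."""
    return [U[0] + V[0], U[1] + V[1]]

def scale(x, U):
    """Multiply the column vector U by the number x."""
    return [x * U[0], x * U[1]]

# Hypothetical sample data (integers, so all comparisons are exact).
B1, B2, C, x = [3, -1], [2, 5], [4, 7], 6

checks = {
    "D1": D(add(B1, B2), C) == D(B1, C) + D(B2, C),
    "D2": D(scale(x, B1), C) == x * D(B1, C) == D(B1, scale(x, C)),
    "D3": D(B1, B1) == 0,
    "D4": D([1, 0], [0, 1]) == 1,
    "D5": D(add(B1, scale(x, C)), C) == D(B1, C),
    "D6": D(B1, C) == -D(C, B1),
}
print(checks)
```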
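Exercises 

1. Compute the following determinants. 

(a) $\begin{vmatrix} 3 & -5 \\ 4 & 2 \end{vmatrix}$   (b) $\begin{vmatrix} 2 & -1 \\ -3 & 4 \end{vmatrix}$ 

(d) $\begin{vmatrix} -5 & 3 \\ 4 & 6 \end{vmatrix}$   (e) $\begin{vmatrix} 3 & 3 \\ -7 & -8 \end{vmatrix}$ 
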

2. Compute the determinant 



4 

-1 

-4 

3 


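$$\begin{vmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{vmatrix}$$

for any real number θ. 

3. Compute the determinant 

$$\begin{vmatrix} \cos\theta & \sin\theta \\ \sin\theta & \cos\theta \end{vmatrix}$$

when 

(a) θ = π, (b) θ = π/2, (c) θ = π/3, (d) θ = π/4. 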


4. Prove: 

(a) The other half of D1. 

(b) The other half of D2. 

(c) The other half of D5. 

5. Let c be a number, and let A be a 2 X 2 matrix. Define cA to be the matrix 
obtained by multiplying all components of A by c. How does D(cA) differ 
from D(A)? 


§2. Determinants of order 3 

We shall define the determinant for 3 × 3 matrices, and we shall see 
that it satisfies properties analogous to those of the 2 × 2 case. 

Let 

$$A = (a_{ij}) = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

be a 3 × 3 matrix. We define its determinant according to the formula 
known as the expansion by a row, say the first row. That is, we define 

$$D(A) = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \tag{1}$$

and we denote D(A) also with the two vertical bars 

$$D(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}.$$

We may describe the sum in (1) as follows. Let $A_{ij}$ be the matrix obtained 
from A by deleting the i-th row and the j-th column. Then the sum for 
D(A) can be written as 

$$a_{11}D(A_{11}) - a_{12}D(A_{12}) + a_{13}D(A_{13}).$$

In other words, each term consists of the product of an element of the first 
row and the determinant of the 2 × 2 matrix obtained by deleting the first 
row and the j-th column, and putting the appropriate sign to this term as 
shown. 

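Example. Let 

$$A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 1 & 4 \\ -3 & 2 & 5 \end{pmatrix}.$$

Then 

$$A_{11} = \begin{pmatrix} 1 & 4 \\ 2 & 5 \end{pmatrix}, \quad A_{12} = \begin{pmatrix} 1 & 4 \\ -3 & 5 \end{pmatrix}, \quad A_{13} = \begin{pmatrix} 1 & 1 \\ -3 & 2 \end{pmatrix},$$

and our formula for the determinant of A yields 

$$D(A) = 2\begin{vmatrix} 1 & 4 \\ 2 & 5 \end{vmatrix} - 1\begin{vmatrix} 1 & 4 \\ -3 & 5 \end{vmatrix} + 0\begin{vmatrix} 1 & 1 \\ -3 & 2 \end{vmatrix}$$
$$= 2(5 - 8) - 1(5 + 12) + 0$$
$$= -23.$$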
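The expansion according to the first row can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper names `det2`, `minor` and `det3` are not from the text, and the matrix A below is the one consistent with the three 2 × 2 determinants that appear in the example.

```python
def det2(m):
    """Determinant ad - bc of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minor(A, i, j):
    """The 2 x 2 matrix A_ij: delete row i and column j (0-based indices)."""
    return [[A[r][c] for c in range(3) if c != j] for r in range(3) if r != i]

def det3(A):
    """Formula (1): a11 D(A11) - a12 D(A12) + a13 D(A13)."""
    return sum((-1) ** j * A[0][j] * det2(minor(A, 0, j)) for j in range(3))

A = [[2, 1, 0], [1, 1, 4], [-3, 2, 5]]
print(det3(A))   # prints: -23
```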
Thus the determinant is a number. To compute this number in the above 
example, we computed the determinants of the 2 × 2 matrices explicitly. 
We can also expand these in the general definition, and thus we find a six-term 
expression for the determinant of a general 3 × 3 matrix $A = (a_{ij})$, 
namely: 

$$D(A) = a_{11}a_{22}a_{33} - a_{11}a_{32}a_{23} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}. \tag{2}$$

Do not memorize (2). Remember only (1), and write down (2) only when 
needed for specific purposes. 

We could have used the other rows to expand the determinant, instead 
of the first row. For instance, the expansion according to the second row 
is given by 

$$-a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{22}\begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} - a_{23}\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}$$
$$= -a_{21}D(A_{21}) + a_{22}D(A_{22}) - a_{23}D(A_{23}).$$

Again, each term is the product of $a_{2j}$ with the determinant of the 2 × 2 
matrix obtained by deleting the second row and j-th column, together with 
the appropriate sign in front of each term. This sign is determined according 
to the pattern: 

$$\begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix}.$$

If you write down the two terms for each one of the 2 × 2 determinants in 
the expansion according to the second row, you will obtain six terms, and 
you will find immediately that they give you the same value which we wrote 
down in formula (2). Thus expanding according to the second row gives 
the same value for the determinant as expanding according to the first row. 

Furthermore, we can also expand according to any one of the columns. 
For instance, expanding according to the first column, we find that 

$$a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{31}\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}$$

yields precisely the same six terms as in (2), if you write down each one of 
the two terms corresponding to each one of the 2 × 2 determinants in the 
above expression. 

Example. We compute the determinant 

$$\begin{vmatrix} 3 & 0 & 1 \\ 1 & 2 & 5 \\ -1 & 4 & 2 \end{vmatrix}$$

by expanding according to the second column. The determinant is equal to 

$$2\begin{vmatrix} 3 & 1 \\ -1 & 2 \end{vmatrix} - 4\begin{vmatrix} 3 & 1 \\ 1 & 5 \end{vmatrix} = 2(6 - (-1)) - 4(15 - 1) = -42.$$

Note that the presence of 0 in the first row and second column eliminates 
one term in the expansion, since this term is equal to 0. 

If we expand the above determinant according to the third column, we 
find the same value, namely 

$$+1\begin{vmatrix} 1 & 2 \\ -1 & 4 \end{vmatrix} - 5\begin{vmatrix} 3 & 0 \\ -1 & 4 \end{vmatrix} + 2\begin{vmatrix} 3 & 0 \\ 1 & 2 \end{vmatrix} = 6 - 60 + 12 = -42.$$

Theorem 2. If A is a 3 × 3 matrix, then $D(A) = D({}^t\!A)$. In other words, 
the determinant of A is equal to the determinant of the transpose of A. 

Proof. This is true because expanding D(A) according to rows or 
columns gives the same value, namely the expression in (2). 
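The claim that every row and every column expansion gives the same value can be checked on the matrix of the example above. The Python sketch below is an illustration only; the function names and the `along` parameter are assumptions, and the sign of each term follows the alternating pattern, written as (-1)^(i+j) with 0-based indices.

```python
def det2(m):
    """Determinant ad - bc of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minor(A, i, j):
    """Delete row i and column j of the 3 x 3 matrix A (0-based indices)."""
    return [[A[r][c] for c in range(3) if c != j] for r in range(3) if r != i]

def expand(A, index, along="row"):
    """Expand the determinant of A along the given row or column."""
    total = 0
    for k in range(3):
        i, j = (index, k) if along == "row" else (k, index)
        total += (-1) ** (i + j) * A[i][j] * det2(minor(A, i, j))
    return total

A = [[3, 0, 1], [1, 2, 5], [-1, 4, 2]]
transpose = [list(col) for col in zip(*A)]

values = [expand(A, k, along) for along in ("row", "column") for k in range(3)]
print(values)                 # six expansions, all equal to -42
print(expand(transpose, 0))   # the transpose has the same determinant
```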


Exercises 

1. Write down the expansion of a 3 X 3 determinant according to the third 
row, the second column, and the third column, and verify in each case that 
you get the same six terms as in (2). 

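2. Compute the following determinants by expanding according to the second 
row, and also according to the third column, as a check for your computation. 
Of course, you should find the same value. 

(a) $\begin{vmatrix} 2 & 1 & 2 \\ 0 & 3 & -1 \\ 4 & 1 & 1 \end{vmatrix}$   (b) $\begin{vmatrix} 3 & -1 & 5 \\ -1 & 2 & 1 \\ -2 & 4 & 3 \end{vmatrix}$   (c) $\begin{vmatrix} 2 & 4 & 3 \\ -1 & 3 & 0 \\ 0 & 2 & 1 \end{vmatrix}$ 

(d) $\begin{vmatrix} 1 & 2 & -1 \\ 0 & 1 & 1 \\ 0 & 2 & 7 \end{vmatrix}$   (e) $\begin{vmatrix} -1 & 5 & 3 \\ 4 & 0 & 0 \\ 2 & 7 & 8 \end{vmatrix}$   (f) $\begin{vmatrix} 3 & 1 & 2 \\ 4 & 5 & 1 \\ -1 & 2 & -3 \end{vmatrix}$ 

3. Compute the following determinants. 

(a) $\begin{vmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 7 \end{vmatrix}$   (b) $\begin{vmatrix} -3 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & -8 \end{vmatrix}$   (c) $\begin{vmatrix} 6 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & -2 \end{vmatrix}$ 

4. Let a, b, c be numbers. In terms of a, b, c, what is the value of the determinant 

$$\begin{vmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{vmatrix}?$$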


5. Find the determinants of the following matrices. 


(a) 


(c) 


(e) 



(g)| 0 
^0 




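6. In terms of the components of the matrix, what is the value of the 
determinant: 

(a) $\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{vmatrix}$?   (b) $\begin{vmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$? 
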
§3. Additional properties of determinants 

We shall now see that 3 × 3 determinants satisfy the properties D1 
through D6, listed previously for 2 × 2 determinants. These properties 
are concerned with the columns of the matrix, and hence it is useful to use 
the same notation which we used before. If $A^1$, $A^2$, $A^3$ are the columns of 
the 3 × 3 matrix A, then we write 

$$D(A) = D(A^1, A^2, A^3).$$

For the rest of this section, we assume that our column and row vectors 
have dimension 3; that is, that they have three components. Thus any 
column vector B in this section can be written in the form 

$$B = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}.$$

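D1. Suppose that the first column can be written as a sum, 

$$A^1 = B + C,$$

that is, 

$$\begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} + \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}.$$

Then 

$$D(B + C, A^2, A^3) = D(B, A^2, A^3) + D(C, A^2, A^3),$$

and the analogous rule holds with respect to the second and third 
columns. 

Proof. We use the definition of the determinant, namely the expansion 
according to the first row. We see that each term splits into a sum of two 
terms corresponding to B and C. For instance, 

$$a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} = b_1\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + c_1\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix},$$

$$a_{12}\begin{vmatrix} b_2 + c_2 & a_{23} \\ b_3 + c_3 & a_{33} \end{vmatrix} = a_{12}\begin{vmatrix} b_2 & a_{23} \\ b_3 & a_{33} \end{vmatrix} + a_{12}\begin{vmatrix} c_2 & a_{23} \\ c_3 & a_{33} \end{vmatrix},$$

$$a_{13}\begin{vmatrix} b_2 + c_2 & a_{22} \\ b_3 + c_3 & a_{32} \end{vmatrix} = a_{13}\begin{vmatrix} b_2 & a_{22} \\ b_3 & a_{32} \end{vmatrix} + a_{13}\begin{vmatrix} c_2 & a_{22} \\ c_3 & a_{32} \end{vmatrix}.$$

Summing with the appropriate sign yields the desired relation. 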
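
D2. If x is a number, then 

$$D(xA^1, A^2, A^3) = x \cdot D(A^1, A^2, A^3),$$

and similarly for the other columns. 

Proof. We have 

$$D(xA^1, A^2, A^3) = xa_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} xa_{21} & a_{23} \\ xa_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} xa_{21} & a_{22} \\ xa_{31} & a_{32} \end{vmatrix}$$
$$= x \cdot D(A^1, A^2, A^3).$$

The proof is similar for the other columns. 
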

D3. If two columns of the matrix are equal, then the determinant is equal 
to 0. 

Proof. Suppose that $A^1 = A^2$, and look at the expansion of the determinant 
according to the first row. Then $a_{11} = a_{12}$, and the first two 
terms cancel. The third term is equal to 0 because it involves a 2 × 2 
determinant whose two columns are equal. The proof for the other cases 
is similar. (Other cases: $A^2 = A^3$ and $A^1 = A^3$.) 

In the 3 × 3 case, we also have the unit vectors, namely 

$$E^1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad E^2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad E^3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},$$

and the unit 3 × 3 matrix, namely 

$$E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

D4. If E is the unit matrix, then $D(E) = D(E^1, E^2, E^3) = 1$. 

Proof. This is obvious from the expansion according to the first row. 

Observe that to prove our basic four properties, we needed to use the 
definition of the determinant, i.e. its expansion according to the first row. 
For the remaining properties, we can give a proof which is not based 
directly on this expansion, but only on the formalism of D1 through D4. 
This has the advantage of making the arguments easier, and in fact of 
making them completely analogous to those used in the 2 X 2 case. We 
carry them out. 

D5. If we add a multiple of one column to another, then the value of the 
determinant does not change. In other words, let x be a number. Then 
for instance, 

$$D(A^1, A^2 + xA^1, A^3) = D(A^1, A^2, A^3),$$

and similarly in all other cases. 

Proof. We have 

$$D(A^1, A^2 + xA^1, A^3) = D(A^1, A^2, A^3) + D(A^1, xA^1, A^3) \quad \text{(by D1)}$$
$$= D(A^1, A^2, A^3) + x \cdot D(A^1, A^1, A^3) \quad \text{(by D2)}$$
$$= D(A^1, A^2, A^3) \quad \text{(by D3).}$$

This proves what we wanted. The proofs of the other cases are similar. 

D6. If two adjacent columns are interchanged, then the determinant 
changes by a sign. In other words, we have 

$$D(A^1, A^3, A^2) = -D(A^1, A^2, A^3),$$

and similarly in the other case. 

Proof. We use the same method as before. We find 

$$0 = D(A^1, A^2 + A^3, A^2 + A^3)$$
$$= D(A^1, A^2, A^2 + A^3) + D(A^1, A^3, A^2 + A^3)$$
$$= D(A^1, A^2, A^2) + D(A^1, A^2, A^3) + D(A^1, A^3, A^2) + D(A^1, A^3, A^3)$$
$$= D(A^1, A^2, A^3) + D(A^1, A^3, A^2),$$

using D1 and D3. This proves D6 in this case, and the other cases are 
proved similarly. 

Using these rules, especially D5, we can compute determinants a little 
more efficiently. For instance, we have already noticed that when a 0 
occurs in the given matrix, we can expand according to the row (or column) 
in which this 0 occurs, and it eliminates one term. Using D5 repeatedly, 
we can change the matrix so as to get as many zeros as possible, and then 
reduce the computation to one term. 

Furthermore, knowing that the determinant of A is equal to the 
determinant of its transpose, we can also conclude that properties D1 
through D6 hold for rows instead of columns. For instance, we can state 
D6 for rows: 

If two adjacent rows are interchanged, then the determinant changes by a 
sign. 

As an exercise, state all the other properties for rows. 


Example. Compute the determinant 

$$\begin{vmatrix} 3 & 0 & 1 \\ 1 & 2 & 5 \\ -1 & 4 & 2 \end{vmatrix}.$$

We already have 0 in the first row. We subtract two times the second row 
from the third row. Our determinant is then equal to 

$$\begin{vmatrix} 3 & 0 & 1 \\ 1 & 2 & 5 \\ -3 & 0 & -8 \end{vmatrix}.$$

We expand according to the second column. The expansion has only one 
term $\neq 0$, with a + sign, and that is: 

$$2\begin{vmatrix} 3 & 1 \\ -3 & -8 \end{vmatrix}.$$

The 2 × 2 determinant can be evaluated by our definition ad - bc, and 
we find the value 

$$2(-24 - (-3)) = -42.$$

Example. We compute the determinant 

$$\begin{vmatrix} 4 & 7 & 10 \\ 3 & 7 & 5 \\ 5 & -1 & 10 \end{vmatrix}.$$

We subtract two times the second row from the first row, and then from the 
third row, yielding 

$$\begin{vmatrix} -2 & -7 & 0 \\ 3 & 7 & 5 \\ -1 & -15 & 0 \end{vmatrix},$$

which we expand according to the third column, and get 

$$-5(30 - 7) = -5(23) = -115.$$

Note that the term has a minus sign, determined by our usual pattern of 
signs. 
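The row version of D5 used in this computation can be verified numerically. The following Python sketch is an illustration only: `det3` implements the six-term formula (2), and the helper `add_multiple_of_row` is a name introduced here for the operation "add x times one row to another row".

```python
def det3(A):
    """Six-term formula (2) for a 3 x 3 matrix given as a list of rows."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * a22 * a33 - a11 * a32 * a23 - a12 * a21 * a33
            + a12 * a23 * a31 + a13 * a21 * a32 - a13 * a22 * a31)

def add_multiple_of_row(A, src, dst, x):
    """Return a copy of A in which x times row src is added to row dst."""
    B = [row[:] for row in A]
    B[dst] = [d + x * s for d, s in zip(A[dst], A[src])]
    return B

A = [[4, 7, 10], [3, 7, 5], [5, -1, 10]]
B = add_multiple_of_row(A, 1, 0, -2)   # first row minus 2 * second row
B = add_multiple_of_row(B, 1, 2, -2)   # third row minus 2 * second row

print(B)                   # [[-2, -7, 0], [3, 7, 5], [-1, -15, 0]]
print(det3(A), det3(B))    # both equal -115
```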

Determinants can also be defined for n × n matrices, satisfying 
analogous properties to D1 through D6. The proofs are similar, but 
sometimes involve more complicated notation, so we shall not go into them. 


Exercises 

1. (a) Write out in full and prove property D1 with respect to the second 

column and the third column. 

(b) Same thing for property D2. 

2. Prove the two cases not treated in the text for property D3. 

3. Prove D5 in the case 

(a) you add a multiple of the third column to the first; 

(b) you add a multiple of the second column to the first; 

(c) you add a multiple of the third column to the second. 

4. If you interchange the first and third columns of the given matrix, how does 
its determinant change? What about interchanging the first and third row? 

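5. Compute the following determinants. 

(a) $\begin{vmatrix} 2 & 1 & 2 \\ 0 & 3 & -1 \\ 4 & 1 & 1 \end{vmatrix}$   (b) $\begin{vmatrix} 3 & -1 & 5 \\ -1 & 2 & 1 \\ -2 & 4 & 3 \end{vmatrix}$   (c) $\begin{vmatrix} 2 & 4 & 3 \\ -1 & 3 & 0 \\ 0 & 2 & 1 \end{vmatrix}$ 

(d) $\begin{vmatrix} 1 & 2 & -1 \\ 0 & 1 & 1 \\ 0 & 2 & 7 \end{vmatrix}$   (e) $\begin{vmatrix} -1 & 5 & 3 \\ 4 & 0 & 0 \\ 2 & 7 & 8 \end{vmatrix}$   (f) $\begin{vmatrix} 3 & 1 & 2 \\ 4 & 5 & 1 \\ -1 & 2 & -3 \end{vmatrix}$ 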
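
6. Compute the following determinants. 

(a) $\begin{vmatrix} 1 & 1 & 3 \\ -1 & 1 & 0 \\ 1 & 2 & 5 \end{vmatrix}$   (b) $\begin{vmatrix} 3 & 2 & 1 \\ 4 & 1 & 2 \\ 1 & 5 & 7 \end{vmatrix}$   (c) $\begin{vmatrix} 3 & 1 & 1 \\ 2 & 5 & 5 \\ 8 & 7 & 7 \end{vmatrix}$ 

(d) $\begin{vmatrix} 4 & -9 & 2 \\ 4 & -9 & 2 \\ 3 & 1 & 0 \end{vmatrix}$   (e) $\begin{vmatrix} 4 & -1 & 1 \\ 2 & 0 & 0 \\ 1 & 5 & 7 \end{vmatrix}$   (f) $\begin{vmatrix} 2 & 0 & 0 \\ 1 & 1 & 0 \\ 8 & 5 & 7 \end{vmatrix}$ 

(g) $\begin{vmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 27 \end{vmatrix}$   (h) $\begin{vmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 9 \end{vmatrix}$   (i) $\begin{vmatrix} 2 & -1 & 4 \\ 3 & 1 & 5 \\ 1 & 2 & 3 \end{vmatrix}$ 
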

7. In general, what is the determinant of a diagonal matrix 

$$\begin{vmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{vmatrix}?$$

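8. Compute the following determinants, making the computation as easy as 
you can. 

(a) $\begin{vmatrix} 4 & -9 & 2 \\ 4 & -9 & 2 \\ 3 & 1 & 5 \end{vmatrix}$   (b) $\begin{vmatrix} 4 & -1 & 1 \\ 2 & 0 & 0 \\ 1 & 5 & 7 \end{vmatrix}$   (c) $\begin{vmatrix} 2 & -1 & 4 \\ 1 & 1 & 5 \\ 1 & 2 & 3 \end{vmatrix}$ 

(d) $\begin{vmatrix} 3 & 1 & 1 \\ 2 & 5 & 5 \\ 8 & 7 & 7 \end{vmatrix}$   (e) $\begin{vmatrix} 2 & 1 & 1 \\ 3 & 1 & 5 \\ 4 & -2 & 3 \end{vmatrix}$   (f) $\begin{vmatrix} -4 & 4 & 2 \\ 5 & 1 & 3 \\ 2 & 1 & 4 \end{vmatrix}$ 

(g) $\begin{vmatrix} 7 & 3 & 2 \\ 1 & -1 & 1 \\ 2 & 1 & 3 \end{vmatrix}$   (h) $\begin{vmatrix} 3 & 2 & 1 \\ 1 & 1 & 1 \\ -1 & 3 & 4 \end{vmatrix}$   (i) $\begin{vmatrix} -2 & -1 & 1 \\ 3 & 1 & -1 \\ -1 & 2 & 3 \end{vmatrix}$ 

(j) $\begin{vmatrix} 2 & 1 & 1 \\ 1 & 1 & 1 \\ 2 & 2 & 2 \end{vmatrix}$   (k) $\begin{vmatrix} -4 & 1 & 2 \\ 3 & 2 & 1 \\ -1 & -1 & 1 \end{vmatrix}$   (l) $\begin{vmatrix} -1 & 3 & 2 \\ 3 & -1 & 1 \\ 6 & -2 & 2 \end{vmatrix}$ 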

9. Let c be a number and multiply each component $a_{ij}$ of a 3 × 3 matrix A 
by c, thus obtaining a new matrix which we denote by cA. How does D(A) 
differ from D(cA)? 

10. Let $x_1$, $x_2$, $x_3$ be numbers. Show that 

$$\begin{vmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \end{vmatrix} = (x_2 - x_1)(x_3 - x_2)(x_3 - x_1).$$

11. Suppose that $A^1$ is a sum of three columns, say 

$$A^1 = B^1 + B^2 + B^3.$$

Using D1 twice, prove that 

$$D(B^1 + B^2 + B^3, A^2, A^3) = D(B^1, A^2, A^3) + D(B^2, A^2, A^3) + D(B^3, A^2, A^3).$$

Using summation notation, we can write this in the form 

$$D(B^1 + B^2 + B^3, A^2, A^3) = \sum_{j=1}^{3} D(B^j, A^2, A^3),$$

which is shorter. In general, suppose that 

$$A^1 = \sum_{j=1}^{n} B^j$$

is a sum of n columns. Using the summation notation, express similarly 

$$D(A^1, A^2, A^3)$$

as a sum of (how many?) terms. 

12. Let $x_j$ (j = 1, 2, 3) be numbers. Let 

$$A^1 = x_1 C^1 + x_2 C^2 + x_3 C^3.$$

Prove that 

$$D(A^1, A^2, A^3) = \sum_{j=1}^{3} x_j D(C^j, A^2, A^3).$$

State and prove the analogous statement when 

$$A^1 = \sum_{j=1}^{n} x_j C^j.$$

13. State the analogous property to that of Exercise 12 with respect to the 
second column. Then with respect to the third column. 

14. If a(t), b(t), c(t), d(t) are functions of t, one can form the determinant 

$$\begin{vmatrix} a(t) & b(t) \\ c(t) & d(t) \end{vmatrix},$$

just as with numbers. Write out in full the determinant 

$$\begin{vmatrix} \sin t & \cos t \\ -\cos t & \sin t \end{vmatrix}.$$

15. Write out in full the determinant 

$$\begin{vmatrix} t + 1 & t - 1 \\ t & 2t + 5 \end{vmatrix}.$$

16. Let f(t), g(t) be two functions having derivatives of all orders. Let φ(t) be 
the function obtained by taking the determinant 

$$\varphi(t) = \begin{vmatrix} f(t) & g(t) \\ f'(t) & g'(t) \end{vmatrix}.$$

Show that 

$$\varphi'(t) = \begin{vmatrix} f(t) & g(t) \\ f''(t) & g''(t) \end{vmatrix},$$

i.e. the derivative is obtained by taking the derivative of the bottom row. 

17. Let 

$$A(t) = \begin{pmatrix} b_1(t) & c_1(t) \\ b_2(t) & c_2(t) \end{pmatrix}$$

be a 2 × 2 matrix of differentiable functions. Let B(t) and C(t) be its 
column vectors. Let 

$$\varphi(t) = \operatorname{Det}(A(t)).$$

Show that 

$$\varphi'(t) = D(B'(t), C(t)) + D(B(t), C'(t)).$$

§4. Independence of vectors 

In the geometric applications of Chapter IX, we studied parallelograms 
and parallelotopes spanned by vectors. Let us look at the situation in 
3-space. Let A, B, C be vectors in $\mathbf{R}^3$, and suppose that A, B are independent. 
We define the plane spanned by A and B to be the set of all points 

$$xA + yB,$$

with all real numbers x, y. When x = y = 0 we obtain the origin, so the 
plane passes through the origin and looks like this. 

[Figure 1: the plane spanned by A and B] 

We say that C is independent of A and B if C does not lie in the above plane, 
i.e. if C cannot be written in the form 

$$C = xA + yB$$

with some numbers x and y. Geometrically, this means that C points in a 
direction outside the plane, as shown on Fig. 2. 

[Figure 2: a vector C pointing outside the plane spanned by A and B] 

We shall now see that the determinant gives us a criterion when C is 
independent of A and B. 

Theorem 3. Let A, B, C be in $\mathbf{R}^3$. If $D(A, B, C) \neq 0$, then C is independent 
of A and B. 

Proof. Suppose that C = xA + yB with some numbers x, y. Then 

$$D(A, B, C) = D(A, B, xA + yB)$$
$$= D(A, B, xA) + D(A, B, yB)$$
$$= xD(A, B, A) + yD(A, B, B)$$
$$= 0 \quad \text{(why?).}$$

This is against our hypothesis, and thus proves our theorem. 
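Theorem 3 can be used as a numerical test: a non-zero determinant certifies that C is independent of A and B. The Python sketch below is an illustration only; the vectors are hypothetical sample data, and `D` evaluates the six-term formula for the matrix whose columns are A, B, C. The theorem as stated gives only one direction, so a zero value by itself proves nothing here, although in the example below C was built to lie in the plane.

```python
def D(A, B, C):
    """Determinant of the 3 x 3 matrix whose columns are A, B, C."""
    return (A[0] * B[1] * C[2] - A[0] * B[2] * C[1] - B[0] * A[1] * C[2]
            + B[0] * C[1] * A[2] + C[0] * A[1] * B[2] - C[0] * B[1] * A[2])

# Hypothetical sample vectors.
A = [1, 2, 0]
B = [0, 1, 3]

# C_in_plane = 2A - 3B lies in the plane spanned by A and B.
C_in_plane = [2 * a - 3 * b for a, b in zip(A, B)]
C_outside = [0, 0, 1]

print(D(A, B, C_in_plane))   # 0, as in the proof of Theorem 3
print(D(A, B, C_outside))    # non-zero, so C_outside is independent of A and B
```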







Exercises 

In the following exercises, let A, B, C be in $\mathbf{R}^3$ and assume that the determinant 
D(A, B, C) is $\neq 0$. Prove: 

1. There is no number x such that B = xA. 

2. There is no number x such that B = xC. 

3. A is independent of B and C. 

4. B is independent of A and C. 

5. Let x, y, z be numbers such that xA + yB + zC = O. Then x = y = z = 0. 

6. Draw a picture of the set of all points 

$$xA + yB + zC,$$

with $0 \le x \le 1$, $0 \le y \le 1$, and $0 \le z \le 1$, in 3-space. This set is called 
the box (or parallelotope) spanned by A, B, C. 

§5. Determinant of a product 

Theorem 4. Let A, B be 3 × 3 matrices. Then 

$$D(AB) = D(A)D(B).$$

In other words, the determinant of a product is the product of the 
determinants. 

Proof. Let AB = C and let $C^m$ be the m-th column of C. From the 
definition of the product of matrices, one sees that if X is a column vector, 
then 

$$AX = x_1 A^1 + x_2 A^2 + x_3 A^3.$$

Apply this remark to each one of the columns of B successively, that is, 
$X = B^1$, $X = B^2$, and $X = B^3$ to find the respective columns of C. We 
conclude that 

$$C^m = b_{1m}A^1 + b_{2m}A^2 + b_{3m}A^3.$$

Therefore 

$$D(AB) = D(C^1, C^2, C^3) = D\left(\sum_{i=1}^{3} b_{i1}A^i, \; \sum_{j=1}^{3} b_{j2}A^j, \; \sum_{k=1}^{3} b_{k3}A^k\right)$$
$$= \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3} b_{i1}b_{j2}b_{k3}\,D(A^i, A^j, A^k).$$

Here we have used repeatedly linearity with respect to each column. Any 
term on the right in the sum will be 0 if i = j, or i = k, or j = k. The other 
terms will correspond to a permutation of $A^1$, $A^2$, $A^3$, and there will be 
six such terms. If you write them out, and interchange columns making the 
appropriate sign change, you will find that the sum is equal to the six-term 
expansion for the determinant of B times the determinant of A, in other 
words 

$$D(AB) = D(B)D(A).$$

This proves our theorem. 

Observe that if A is invertible and AB = I, then we necessarily have 
$D(A) \neq 0$, because according to Theorem 4, 

$$1 = D(I) = D(A)D(B).$$

The converse is also true, that is: If $D(A) \neq 0$, then A is invertible. We 
shall discuss it in the next section. 
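Theorem 4 is easy to test on a concrete pair of matrices. The Python sketch below is an illustration only: the two matrices are sample data chosen for the check, and `det3` and `mat_mul` are helper names introduced here. A single example is a sanity check, not a proof.

```python
def det3(A):
    """Six-term formula for the determinant of a 3 x 3 matrix (list of rows)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * a22 * a33 - a11 * a32 * a23 - a12 * a21 * a33
            + a12 * a23 * a31 + a13 * a21 * a32 - a13 * a22 * a31)

def mat_mul(A, B):
    """Return the product AB, computed row by column."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Sample matrices chosen for the check.
A = [[2, 1, 0], [1, 1, 4], [-3, 2, 5]]
B = [[3, 0, 1], [1, 2, 5], [-1, 4, 2]]

product = mat_mul(A, B)
print(det3(product), det3(A) * det3(B))   # the two numbers agree
```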


§6. Inverse of a matrix 

Theorem 5. Let A be a square matrix such that $D(A) \neq 0$. Then A is 
invertible. 

Let us consider the 2 × 2 case. Let 

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

be a 2 × 2 matrix, and assume that its determinant $ad - bc \neq 0$. We 
wish to find an inverse for A, that is a 2 × 2 matrix 

$$X = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$$

such that 

$$AX = XA = I.$$

Let us look at the first requirement, AX = I, which, written out in full, 
looks like this: 

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x & y \\ z & w \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Let us look at the first column of AX. We must solve the equations 

$$(*) \qquad ax + bz = 1, \qquad cx + dz = 0.$$

This is a system of two equations in two unknowns, x and z, which we know 
how to solve. Similarly, looking at the second column, we see that we 
must solve a system of two equations in the unknowns y, w, namely 

$$(**) \qquad ay + bw = 0, \qquad cy + dw = 1.$$

Example. Let 

$$A = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}.$$

We seek a matrix X such that AX = I. We must therefore solve the 
systems of linear equations 

$$2x + z = 1, \qquad 4x + 3z = 0,$$

and 

$$2y + w = 0, \qquad 4y + 3w = 1.$$

By the ordinary method of solving two equations in two unknowns, we 
find 

$$x = 3/2, \quad z = -2 \quad \text{and} \quad y = -1/2, \quad w = 1.$$

Thus the matrix 

$$X = \begin{pmatrix} 3/2 & -1/2 \\ -2 & 1 \end{pmatrix}$$

is such that AX = I. The reader will also verify by direct multiplication 
that XA = I. This solves for the desired inverse. 

The same procedure, of course, works for the general systems (*) and 
(**). Consider (*). Multiply the first equation by d, multiply the second 
equation by b, and subtract. We get 

$$(ad - bc)x = d,$$

whence 

$$x = \frac{d}{ad - bc}.$$

We see that the determinant of A occurs in the denominator. You can 
solve similarly for y, z, w and you will find similar expressions with only 
D(A) in the denominator. This proves Theorem 5 in the 2 × 2 case. 

The proof in the 3 × 3 case is also done by solving linear equations, but 
we shall omit it. 
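The 2 × 2 procedure can be sketched in Python with exact fractions. This is an illustration under stated assumptions: the text derives only x = d/(ad - bc), and the other three entries used below are what one finds by solving (*) and (**) in the same way; the function names `inverse2` and `mat_mul2` are introduced here.

```python
from fractions import Fraction

def inverse2(a, b, c, d):
    """Inverse of the 2 x 2 matrix with rows (a, b) and (c, d), as exact fractions."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is 0, so the matrix is not invertible")
    det = Fraction(det)
    # x = d/det comes from (*); z, y, w are obtained similarly from (*) and (**).
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul2(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [4, 3]]
X = inverse2(2, 1, 4, 3)
print(X)                 # entries 3/2, -1/2, -2, 1
print(mat_mul2(A, X))    # the unit matrix
print(mat_mul2(X, A))    # the unit matrix again
```

Using `Fraction` rather than floating-point numbers keeps the check AX = XA = I exact.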

Exercises 

1. Find the inverses of the following matrices. 



2. Write down the general formula for the inverse of a 2 X 2 matrix 




PART THREE 

MAPPINGS FROM VECTORS 

TO VECTORS 



One way in which we use linear algebra is to approximate an arbitrary 
mapping by a linear mapping. The derivative of an arbitrary mapping is in 
a sense the “best” linear approximation to the mapping. This notion is 
discussed in the chapter of this part. When studying parametrized surfaces 
in Chapter XV, it should be useful to have understood the idea of 
linear approximation, since it allows us to define the tangent plane to such 
a surface in a natural way, as the image of the approximating linear map. 

If you are not interested in the idea of the proof behind the implicit 
function theorem, you can work mechanically without knowing anything 
about inverse mappings in order to compute the derivative of an implicit 
function. 



CHAPTER XI 


Applications to Functions of 
Several Variables 


Having acquired the language of linear maps and matrices, we shall be 
able to define the derivative of a mapping, or rather, of a differentiable 
mapping. The theoretical considerations involved in the proof of the 
general chain rule of §3 become of course a little abstract. But you should 
note that it is precisely the availability of the notion of linear mapping 
which allows us to give a statement of the chain rule, and a proof, which 
runs exactly parallel to the proof for functions of one variable, as given 
in the First Course. The analysis profits from algebra, and conversely, 
the algebra of linear mappings finds a neat application which enhances 
its attractiveness. 

§1. The derivative as a linear map 

We shall interpret our notion of differentiability given in Chapter III 
in terms of linear mappings. 

Let U be an open set in $\mathbf{R}^n$. Let f be a function defined on U. Let P 
be a point of U, and assume that f is differentiable at P. Then there is a 
vector A, and a function g such that for all small vectors H we can write 

$$f(P + H) = f(P) + A \cdot H + \|H\| g(H), \tag{1}$$

and 

$$\lim_{\|H\| \to 0} g(H) = 0. \tag{2}$$

The vector A, expressed in terms of coordinates, is none other than the 
vector of partial derivatives: 

$$A = \operatorname{grad} f(P) = (D_1 f(P), \ldots, D_n f(P)).$$

We have seen that there is a linear map $L = L_A$ such that 

$$L(H) = A \cdot H.$$

Our condition that f is differentiable may therefore be expressed by 
saying that there is a linear map $L: \mathbf{R}^n \to \mathbf{R}$ and a function g defined for 
sufficiently small H, such that 

$$f(P + H) = f(P) + L(H) + \|H\| g(H) \tag{3}$$

and 

$$\lim_{\|H\| \to 0} g(H) = 0.$$

In the case of functions of one variable, we have of course the same kind 
of formula, namely 

$$f(a + h) = f(a) + ch + |h| g(h)$$

where 

$$\lim_{h \to 0} g(h) = 0.$$

Here, a, h are numbers, and so is the ordinary derivative c. But the map 
$L_c: \mathbf{R} \to \mathbf{R}$ such that $L_c(h) = ch$ (multiplication by the number c) is a 
linear map, so that also in this case, we can write 

$$f(a + h) = f(a) + L_c(h) + |h| g(h).$$

Up to now, we did not define the notion of derivative for functions of 
several variables. We now define the derivative of f at P to be this linear 
map, which we shall denote by Df(P) or also f'(P). This notation is therefore 
entirely similar to the notation used for functions of one variable. 
We could not make the definition before we knew what a linear map was. 
All the theory developed in Chapters II through VII could be carried out 
knowing only dot products, and this is the reason we postponed making 
the general definition of derivative until now. 

If L is a linear map, then it will be useful to omit some parentheses in 
order to simplify the notation. Thus we shall sometimes write Lv instead 
of L(v). With this convention, we can write (3) in the form 

$$f(P + H) = f(P) + Df(P)H + \|H\| g(H), \tag{4}$$

or also 

$$f(P + H) = f(P) + f'(P)H + \|H\| g(H). \tag{5}$$

These ways of expressing differentiability are those which generalize 
to arbitrary mappings. 

Let U be an open set in R n . Let F: U —> R m be a mapping. Let F be a 
point of U. We shall say that F is differentiable at F if there exists a linear 
map 


in 
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and a mapping G defined for all vectors H sufficiently small, such that we 
have 

(6) F(P + H) = F(P) + LH+ \\H\\G(H) 
and 

(7) lim G(H) = O. 

\\h\\-+o 


If such a linear mapping L exists, then we interpret (6) as saying that 
L approximates F up to an error term whose magnitude is small, near the 
point P. 

A linear map L satisfying conditions (6) and (7) will be said to be 
tangent to F at P. It is also said to be the best linear approximation to 
F at P. 

Just as before, we define a map $\psi$ defined for small H to be o(H) (“little
oh of H”) if

\[ \lim_{\|H\| \to 0} \frac{\psi(H)}{\|H\|} = O. \]


Then we can write our definition of differentiability in the form 

F(P + H) = F(P) + L(H) + o(H), 
where L is a linear map. 

Theorem 1. Suppose that there exist linear maps L, M which are tangent
to F at P. Then L = M. In other words, if there exists one linear map
which is tangent to F at P, then there is only one.

Proof. Suppose that there are two mappings $G_1$, $G_2$ such that for all
sufficiently small H, we have

\[ F(P + H) = F(P) + LH + \|H\| G_1(H), \]

\[ F(P + H) = F(P) + MH + \|H\| G_2(H), \]

and

\[ \lim_{\|H\| \to 0} G_1(H) = O, \qquad \lim_{\|H\| \to 0} G_2(H) = O. \]

We must show that for any vector Y we have LY = MY. Let t range
over small positive numbers. Then tY is small, and P + tY lies in U.
Thus F(P + tY) is defined. By hypothesis, we have

\[ F(P + tY) = F(P) + L(tY) + \|tY\| G_1(tY), \]

\[ F(P + tY) = F(P) + M(tY) + \|tY\| G_2(tY). \]
Subtracting, we obtain

\[ O = L(tY) - M(tY) + \|tY\| [G_1(tY) - G_2(tY)]. \]

Let $G = G_1 - G_2$. Since L, M are linear, we can write L(tY) = tL(Y)
and M(tY) = tM(Y). Since t is positive, we also have $\|tY\| = t\|Y\|$.
Consequently, we obtain

\[ tM(Y) - tL(Y) = t\|Y\| G(tY). \]

Take $t \neq 0$. Dividing by t yields

\[ M(Y) - L(Y) = \|Y\| G(tY). \]

As t approaches 0, G(tY) approaches O also. Hence the right-hand side
of this last equation approaches O. But M(Y) - L(Y) is a fixed vector.
The only way this is possible is that M(Y) - L(Y) = O, in other words,
M(Y) = L(Y), as was to be shown.
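The uniqueness statement can also be observed numerically. The sketch below is an illustration only, under assumptions of our own: the sample mapping $F(x, y) = (x^2 + y^2, e^{xy})$, the point P = (1, 1), its tangent linear map L given by the matrix with rows (2, 2) and (e, e), and a second, arbitrarily perturbed linear map M. For the direction Y = (1, 0) we compute the quotient $\|F(P + tY) - F(P) - A(tY)\| / \|tY\|$ with A = L and with A = M.

```python
import math

def F(x, y):
    # Assumed sample mapping, chosen for illustration.
    return (x * x + y * y, math.exp(x * y))

def apply(A, v):
    """Multiply a 2x2 matrix (list of rows) by a vector."""
    return tuple(row[0] * v[0] + row[1] * v[1] for row in A)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def quotient(A, P, Y, t):
    """Return ||F(P + tY) - F(P) - A(tY)|| / ||tY||."""
    H = (t * Y[0], t * Y[1])
    FP = F(*P)
    FPH = F(P[0] + H[0], P[1] + H[1])
    AH = apply(A, H)
    return norm(tuple(FPH[i] - FP[i] - AH[i] for i in range(2))) / norm(H)

P, Y = (1.0, 1.0), (1.0, 0.0)
L = [[2.0, 2.0], [math.e, math.e]]   # tangent linear map of F at P
M = [[2.5, 2.0], [math.e, math.e]]   # a different linear map (hypothetical)

q_L = [quotient(L, P, Y, 10.0 ** (-k)) for k in range(1, 6)]
q_M = [quotient(M, P, Y, 10.0 ** (-k)) for k in range(1, 6)]
print(q_L)   # tends to 0
print(q_M)   # tends to ||(M - L)Y|| / ||Y|| = 0.5
```

Only for the tangent map does the quotient tend to 0; for M it tends to the nonzero number $\|(M - L)Y\| / \|Y\|$, in agreement with the proof.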

If there exists a linear map tangent to F at P, we shall denote this
linear map by F'(P), or DF(P), and call it the derivative of F at P. We may
therefore write

\[ F(P + H) = F(P) + F'(P)H + \|H\| G(H) \]

instead of (6).

In the next section, we shall see how the linear map F'(P) can be computed,
or rather how its matrix can be computed when we deal with vectors
as n-tuples.


Exercises 


1. Let $f: R \to R$ be a function, and let a be a number. Assume that there
exists a linear map L tangent to f at a. Show that

\[ L(1) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}. \]

2. Conversely, assume that the limit

\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \]

exists and is equal to a number b. Let $L_b$ be the linear map such
that $L_b(x) = bx$ for all numbers x. Show that $L_b$ is tangent to f at a. It is
customary to identify the number b and the linear map $L_b$, and to call either
one the derivative of f at a.

3. Let $L: R \to R^n$ be a linear map from the reals into $R^n$. Show that there is
some element v in $R^n$ such that L(x) = xv for all numbers x.

4. Going back to Chapter II, let X(t) be a curve, defined for all numbers t,
say. Discuss in a manner analogous to Exercises 1 and 2 the derivative
dX/dt, and the linear map $L_t: R \to R^n$ which is tangent to X at t.


§2. The Jacobian matrix


Throughout this section, all our vectors will be vertical vectors. We
let $D_1, \ldots, D_n$ be the usual partial derivatives. Thus $D_i = \partial/\partial x_i$.

Let $F: R^n \to R^m$ be a mapping. We can represent F by coordinate
functions. In other words, there exist functions $f_1, \ldots, f_m$ such that

\[ F(X) = \begin{pmatrix} f_1(X) \\ \vdots \\ f_m(X) \end{pmatrix} = {}^t(f_1(X), \ldots, f_m(X)). \]

To simplify the typography, we shall sometimes write a vertical vector
as the transpose of a horizontal vector, as we have just done.

We view X as a column vector, $X = {}^t(x_1, \ldots, x_n)$.

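Let us assume that the partial derivatives of each function $f_i$
(i = 1, ..., m) exist. We can then form the matrix of partial derivatives:

\[
\left( \frac{\partial f_i}{\partial x_j} \right) =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix},
\qquad i = 1, \ldots, m \text{ and } j = 1, \ldots, n.
\]

This matrix is called the Jacobian matrix of F, and is denoted by $J_F(X)$.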

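In the case of two variables (x, y), say F is given by functions (f, g),
so that

\[ F(x, y) = (f(x, y), g(x, y)), \]

then the Jacobian matrix is

\[ J_F(x, y) = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\ \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \end{pmatrix}. \]
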


(As we have done just now, we sometimes write the vectors horizontally, 
although to be strictly correct, they should be written vertically.) 
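Example 1. Let $F: R^2 \to R^2$ be the mapping defined by

\[ F(x, y) = \begin{pmatrix} x^2 + y^2 \\ e^{xy} \end{pmatrix} = \begin{pmatrix} f(x, y) \\ g(x, y) \end{pmatrix}. \]

Find the Jacobian matrix $J_F(P)$ for P = (1, 1).

The Jacobian matrix at an arbitrary point (x, y) is

\[ J_F(x, y) = \begin{pmatrix} 2x & 2y \\ y e^{xy} & x e^{xy} \end{pmatrix}. \]

Hence when x = 1, y = 1, we find:

\[ J_F(1, 1) = \begin{pmatrix} 2 & 2 \\ e & e \end{pmatrix}. \]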
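A Jacobian matrix can be checked against difference quotients. The following is a minimal sketch, assuming the mapping $F(x, y) = (x^2 + y^2, e^{xy})$ and the point (1, 1); the helper name `jacobian_fd` and the step size are our own choices, not part of the text.

```python
import math

def F(x, y):
    return (x * x + y * y, math.exp(x * y))

def jacobian_fd(F, x, y, h=1e-6):
    """Approximate the Jacobian matrix of F at (x, y) by central differences."""
    fx_plus, fx_minus = F(x + h, y), F(x - h, y)
    fy_plus, fy_minus = F(x, y + h), F(x, y - h)
    return [[(fx_plus[i] - fx_minus[i]) / (2 * h),    # partial with respect to x
             (fy_plus[i] - fy_minus[i]) / (2 * h)]    # partial with respect to y
            for i in range(len(fx_plus))]

J = jacobian_fd(F, 1.0, 1.0)
expected = [[2.0, 2.0], [math.e, math.e]]   # matrix of partial derivatives at (1, 1)
for row_numeric, row_exact in zip(J, expected):
    print(row_numeric, row_exact)
```

Each row of the matrix consists of the partial derivatives of one coordinate function, so the numerical rows should agree with (2, 2) and (e, e) up to rounding.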

Example 2. Let $F: R^2 \to R^3$ be the mapping defined by

' xy 

F(x, y) — I sin x 


Find $J_F(P)$ at the point $P = (\pi, \pi/2)$.

The Jacobian matrix at an arbitrary point (x, y) is 


Hence 



C 


Theorem 2. Let U be an open set in $R^n$. Let $F: U \to R^m$ be a mapping,
having coordinate functions $f_1, \ldots, f_m$. Assume that each function $f_i$ is
differentiable at a point X of U. Then F is differentiable at X, and the
matrix representing the linear map DF(X) = F'(X) is the Jacobian
matrix $J_F(X)$.

Proof. For each integer i between 1 and m, there is a function $g_i$ such
that

\[ \lim_{\|H\| \to 0} g_i(H) = 0, \]

and such that we can write

\[ f_i(X + H) = f_i(X) + \operatorname{grad} f_i(X) \cdot H + \|H\| g_i(H). \]

We view X and F(X) as vertical vectors. By definition, we can then write

\[ F(X + H) = {}^t(f_1(X + H), \ldots, f_m(X + H)). \]

Hence

\[ F(X + H) = \begin{pmatrix} f_1(X) \\ \vdots \\ f_m(X) \end{pmatrix} + \begin{pmatrix} \operatorname{grad} f_1(X) \cdot H \\ \vdots \\ \operatorname{grad} f_m(X) \cdot H \end{pmatrix} + \|H\| \begin{pmatrix} g_1(H) \\ \vdots \\ g_m(H) \end{pmatrix}. \]

The term in the middle, involving the gradients, is precisely equal to the
product of the Jacobian matrix, times H, i.e. to

\[ J_F(X)H. \]

Let $G(H) = {}^t(g_1(H), \ldots, g_m(H))$ be the vector on the right. Then

\[ F(X + H) = F(X) + J_F(X)H + \|H\| G(H). \]

As $\|H\|$ approaches 0, each coordinate of G(H) approaches 0. Hence
G(H) approaches O; in other words,

\[ \lim_{\|H\| \to 0} G(H) = O. \]

Hence the linear map represented by the matrix $J_F(X)$ is tangent to F
at X. Since such a linear map is unique, we have proved our theorem.

Let U be open in $R^n$ and $F: U \to R^n$ be a differentiable map into the
same-dimensional space. Then the Jacobian matrix $J_F(X)$ is a square
matrix, and its determinant is called the Jacobian determinant of F at X.
We denote it by

\[ \Delta_F(X). \]

Example 3. Let F be as in Example 1, $F(x, y) = (x^2 + y^2, e^{xy})$. Then
the Jacobian determinant is equal to

\[ \Delta_F(x, y) = \begin{vmatrix} 2x & 2y \\ y e^{xy} & x e^{xy} \end{vmatrix} = 2x^2 e^{xy} - 2y^2 e^{xy}. \]

In particular,

\[ \Delta_F(1, 1) = 2e - 2e = 0, \]

\[ \Delta_F(1, 2) = 2e^2 - 8e^2 = -6e^2. \]
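The two values of the determinant can be reproduced directly from the entries of the matrix. This is a small sketch for $F(x, y) = (x^2 + y^2, e^{xy})$; the function name `jacobian_det` is our own.

```python
import math

def jacobian_det(x, y):
    """Determinant of the matrix with rows (2x, 2y) and (y e^{xy}, x e^{xy})."""
    e = math.exp(x * y)
    return (2 * x) * (x * e) - (2 * y) * (y * e)

d11 = jacobian_det(1.0, 1.0)   # 2e - 2e
d12 = jacobian_det(1.0, 2.0)   # 2e^2 - 8e^2
print(d11, d12, -6 * math.exp(2))
```

The determinant equals $2(x^2 - y^2)e^{xy}$, so it vanishes exactly on the two lines y = x and y = -x.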
Example 4. An important map is given by the polar coordinates,
$F: R^2 \to R^2$ such that $F(r, \theta) = (r \cos \theta, r \sin \theta)$. We can view the map
as defined on all of $R^2$, although when selecting polar coordinates, we
take $r \geq 0$. We see that F maps a rectangle into a circular sector (Fig. 1).



Figure 1 


It is easy to compute the Jacobian matrix and determinant. Do this as 
Exercise 7. 

Exercises 


1. In each of the following cases, compute the Jacobian matrix of F.

(a) $F(x, y) = (x + y, x^2 y)$    (b) $F(x, y) = (\sin x, \cos xy)$

(c) $F(x, y) = (e^{xy}, \log x)$    (d) $F(x, y, z) = (xz, xy, yz)$

(e) $F(x, y, z) = (xyz, x^2 z)$    (f) $F(x, y, z) = (\sin xyz, xz)$

2. Find the Jacobian matrix of the mappings in Exercise 1 evaluated at the
following points.

(a) (1, 2)    (b) $(\pi, \pi/2)$    (c) (1, 4)

(d) (1, 1, -1)    (e) (2, -1, -1)    (f) $(\pi, 2, 4)$

3. Let $L: R^n \to R^m$ be a linear map. Show that for each point X of $R^n$ we
have L'(X) = L.

4. Find the Jacobian matrix of the following maps.

(a) $F(x, y) = (xy, x^2)$    (b) $F(x, y, z) = (\cos xy, \sin xy, xz)$

5. Find the Jacobian determinant of the map in Exercise 1(a). Determine all
points where the Jacobian determinant is equal to 0.

6. Find the Jacobian determinant of the map in Exercise 1(b).

7. Let $F: R^2 \to R^2$ be the map defined by

\[ F(r, \theta) = (r \cos \theta, r \sin \theta), \]
in other words the polar coordinates map

\[ x = r \cos \theta, \qquad y = r \sin \theta. \]

Find the Jacobian matrix and Jacobian determinant of this mapping.
Determine all points $(r, \theta)$ where the Jacobian determinant vanishes.

8. Let $F: R^3 \to R^3$ be the mapping defined by

\[ F(r, \theta, \varphi) = (r \sin \varphi \cos \theta, r \sin \varphi \sin \theta, r \cos \varphi) \]
or in other words

\[ x = r \sin \varphi \cos \theta, \qquad y = r \sin \varphi \sin \theta, \qquad z = r \cos \varphi. \]

Find the Jacobian matrix and Jacobian determinant of this mapping.

9. Find the Jacobian matrix and determinant of the map

\[ F(r, \theta) = (e^r \cos \theta, e^r \sin \theta). \]

Show that the Jacobian determinant is never 0. Show that there exist two
distinct points $(r_1, \theta_1)$ and $(r_2, \theta_2)$ such that

\[ F(r_1, \theta_1) = F(r_2, \theta_2). \]


§3. The chain rule 

In the First Course, we proved a chain rule for composite functions.
Earlier in this book, a chain rule was given for a composite of a function
and a map defined for real numbers, but having values in $R^n$. In this
section, we give a general formulation of the chain rule for arbitrary
compositions of mappings.

Let U be an open set in $R^n$, and let V be an open set in $R^m$. Let
$F: U \to R^m$ be a mapping, and assume that all values of F are contained
in V. Let $G: V \to R^s$ be a mapping. Then we can form the composite
mapping $G \circ F$ from U into $R^s$.

Let X be a point of U. Then F(X) is a point of V by assumption. Let
us assume that F is differentiable at X, and that G is differentiable at
F(X). We know that F'(X) is a linear map from $R^n$ into $R^m$, and G'(F(X))
is a linear map from $R^m$ into $R^s$. Thus we may compose these two linear
maps to give a linear map $G'(F(X)) \circ F'(X)$ from $R^n$ into $R^s$.

The next theorem tells us what the derivative of $G \circ F$ is in terms of
the derivative of F at X, and the derivative of G at F(X). Please observe
how the statement and proof of the theorem will be entirely parallel to
the statement and proof of the theorem for the chain rule in the First
Course.
Theorem 3. Let U be an open set in $R^n$, let V be an open set in $R^m$.
Let $F: U \to R^m$ be a mapping such that all values of F are contained in V.
Let $G: V \to R^s$ be a mapping. Let X be a point of U such that F is
differentiable at X. Assume that G is differentiable at F(X). Then the
composite mapping $G \circ F$ is differentiable at X, and its derivative is given by

\[ (G \circ F)'(X) = G'(F(X)) \circ F'(X). \]

Proof. By definition of differentiability, there exists a mapping $\Phi_1$
such that

\[ \lim_{\|H\| \to 0} \Phi_1(H) = O \]

and

\[ F(X + H) = F(X) + F'(X)H + \|H\| \Phi_1(H). \]

Similarly, there exists a mapping $\Phi_2$ such that

\[ \lim_{\|K\| \to 0} \Phi_2(K) = O, \]

and

\[ G(Y + K) = G(Y) + G'(Y)K + \|K\| \Phi_2(K). \]

We let K = K(H) be

\[ K = F(X + H) - F(X) = F'(X)H + \|H\| \Phi_1(H). \]

Then

\[
\begin{aligned}
G(F(X + H)) &= G(F(X) + K) \\
&= G(F(X)) + G'(F(X))K + o(K).
\end{aligned}
\]

Using the fact that G'(F(X)) is linear, and

\[ K = F(X + H) - F(X) = F'(X)H + \|H\| \Phi_1(H), \]
we can write

\[
\begin{aligned}
(G \circ F)(X + H) = {} & (G \circ F)(X) + G'(F(X))F'(X)H \\
& + \|H\| G'(F(X))\Phi_1(H) + o(K).
\end{aligned}
\]

Using simple estimates which we do not give in detail, we conclude that

\[ (G \circ F)(X + H) = (G \circ F)(X) + G'(F(X))F'(X)H + o(H). \]

This proves that the linear map

\[ G'(F(X))F'(X) \]

is tangent to $G \circ F$ at X. It must therefore be equal to $(G \circ F)'(X)$, as was
to be shown.
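In terms of matrices, the chain rule says that the Jacobian matrix of $G \circ F$ at X is the product of the Jacobian matrix of G at F(X) with the Jacobian matrix of F at X. The sketch below checks this numerically. The two maps F and G are hypothetical examples of our own, and the Jacobian matrices are approximated by central differences rather than computed exactly.

```python
import math

def jacobian_fd(F, X, h=1e-6):
    """Central-difference Jacobian of F: R^n -> R^m at the point X (a list)."""
    m = len(F(X))
    J = [[0.0] * len(X) for _ in range(m)]
    for j in range(len(X)):
        Xp, Xm = list(X), list(X)
        Xp[j] += h
        Xm[j] -= h
        Fp, Fm = F(Xp), F(Xm)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * h)
    return J

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical sample maps, chosen only for illustration.
def F(X):                      # R^2 -> R^3
    x, y = X
    return [x * y, math.sin(x), x + y * y]

def G(Y):                      # R^3 -> R^2
    u, v, w = Y
    return [u * w + v, math.exp(u) * v]

X = [0.5, -1.0]
lhs = jacobian_fd(lambda Z: G(F(Z)), X)                 # matrix of (G o F)'(X)
rhs = matmul(jacobian_fd(G, F(X)), jacobian_fd(F, X))   # G'(F(X)) times F'(X)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The product is a 2 x 3 matrix times a 3 x 2 matrix, which matches the composition of a linear map from $R^3$ into $R^2$ with a linear map from $R^2$ into $R^3$.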
§4. Inverse mappings and implicit functions

Let U be open in $R^n$ and let $F: U \to R^n$ be a map, given by coordinate
functions:

\[ F(X) = (f_1(X), \ldots, f_n(X)). \]

If all the partial derivatives of all functions $f_i$ exist and are continuous,
we say that F is a $C^1$-map. We say that F is $C^1$-invertible on U if the image
F(U) is an open set V, and if there exists a $C^1$-map $G: V \to U$ such that
$G \circ F$ and $F \circ G$ are the respective identity mappings on U and V.



Example 1. Let A be a fixed vector, and let $F: R^n \to R^n$ be the translation
by A, namely F(X) = X + A. Then F is $C^1$-invertible, its inverse
being translation by -A.

Example 2. Let U be the subset of $R^2$ consisting of all pairs $(r, \theta)$ with
r > 0 and $0 < \theta < \pi$. Let


\[ F(r, \theta) = (r \cos \theta, r \sin \theta). \]

Let $x = r \cos \theta$ and $y = r \sin \theta$. Then the image of U is the upper half-plane
consisting of all (x, y) such that y > 0, and arbitrary x (Fig. 4).



Figure 4 


We can solve for the inverse map G, namely:

\[ r = \sqrt{x^2 + y^2} \quad \text{and} \quad \theta = \arccos \frac{x}{r}, \]

so that

\[ G(x, y) = \left( \sqrt{x^2 + y^2}, \ \arccos \frac{x}{\sqrt{x^2 + y^2}} \right). \]
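The formula for the inverse can be tried out on a few points. This is a sketch only; the sample points are our own choice, and the last line illustrates what happens outside the set U.

```python
import math

def F(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def G(x, y):
    """Inverse of F on the upper half-plane y > 0."""
    r = math.sqrt(x * x + y * y)
    return (r, math.acos(x / r))

# Sample points with r > 0 and 0 < theta < pi.
samples = [(1.0, 0.3), (2.5, math.pi / 2), (0.1, 3.0)]
round_trips = [G(*F(r, t)) for (r, t) in samples]
for s, rt in zip(samples, round_trips):
    print(s, rt)

# Outside U the same formula does not recover theta: here theta = -0.3.
print(G(*F(1.0, -0.3)))
```

For points of U the composite $G \circ F$ returns the point we started with. For theta = -0.3 the arccosine returns 0.3 instead, which shows why the restriction $0 < \theta < \pi$ is needed for this particular formula.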

In many applications, a map is not necessarily invertible, but still has a
useful property locally. Let P be a point of U. We say that F is locally
$C^1$-invertible at P if there exists an open set $U_1$ contained in U and
containing P such that F is $C^1$-invertible on $U_1$.

Example 3. If we view $F(r, \theta) = (r \cos \theta, r \sin \theta)$ as defined on all of
$R^2$, then F is not $C^1$-invertible on all of $R^2$, but given any point $(r, \theta)$
with $r \neq 0$, it is locally invertible at that point. One could see this by
giving an explicit inverse map as we did in Example 2. At any rate, from
Example 2, we see that F is $C^1$-invertible on the set r > 0 and $0 < \theta < \pi$.

In most cases, it is not possible to define an inverse map by explicit 
formulas. However, there is a very important theorem which allows us 
to conclude that a map is locally invertible at a point. 

Inverse Mapping Theorem. Let $F: U \to R^n$ be a $C^1$-map. Let P be a
point of U. If the Jacobian determinant $\Delta_F(P)$ is not equal to 0, then F
is locally $C^1$-invertible at P.

A proof of this theorem is too involved to be given in this book. However,
we make the following comment. The fact that the determinant
$\Delta_F(P)$ is not 0 implies (and in fact is equivalent to) the fact that the
Jacobian matrix is invertible, and the Jacobian matrix represents the linear
map F'(P). Thus the inverse mapping theorem asserts that if the derivative
F'(P) is invertible, then the map F itself is locally invertible at P. Since
it is usually very easy to determine whether the Jacobian determinant
vanishes or not, we see that the inverse mapping theorem gives us a
simple criterion for local invertibility.
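Although the theorem gives no formula for the local inverse, its values can often be approximated numerically. The sketch below is not the proof of the theorem and is not taken from the text: it uses Newton's iteration, which at each step inverts the Jacobian matrix, for the sample map $F(x, y) = (x^2 - y^2, 2xy)$ near the point P = (1, 1), where the Jacobian determinant is $4(x^2 + y^2) = 8$. The names `local_inverse`, `target` and `start` are our own.

```python
def F(x, y):
    return (x * x - y * y, 2 * x * y)

def J(x, y):
    """Jacobian matrix of F at (x, y), as a pair of rows."""
    return ((2 * x, -2 * y), (2 * y, 2 * x))

def local_inverse(target, start, steps=20):
    """Newton iteration X <- X - J(X)^{-1} (F(X) - target), started at `start`."""
    x, y = start
    for _ in range(steps):
        (a, b), (c, d) = J(x, y)
        det = a * d - b * c              # Jacobian determinant, assumed nonzero
        fx, fy = F(x, y)
        r1, r2 = fx - target[0], fy - target[1]
        # Solve the 2x2 system J (dx, dy) = (r1, r2) by Cramer's rule.
        dx = (d * r1 - b * r2) / det
        dy = (a * r2 - c * r1) / det
        x, y = x - dx, y - dy
    return (x, y)

P = (1.0, 1.0)        # F(P) = (0, 2)
Y = (0.1, 2.2)        # a point near F(P)
X = local_inverse(Y, P)
print(X, F(*X))
```

Starting the iteration at P selects the inverse image of Y which lies near P. The point -X has the same image under F, so the inverse is only local.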

Example 4. Consider the case of one variable, y = f(x). In the First
Course, we proved that if $f'(x_0) \neq 0$ at a point $x_0$, then there is an inverse
function defined near $y_0 = f(x_0)$. Indeed, say $f'(x_0) > 0$. By continuity,
assuming that f' is continuous (i.e. f is $C^1$), we know that f'(x) > 0 for x
close to $x_0$. Hence f is strictly increasing, and an inverse function exists
near $x_0$. In fact, we determined the derivative. If g is the inverse function,
then we proved that

\[ g'(y_0) = f'(x_0)^{-1}. \]
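The formula for the derivative of the inverse function can be tested on an example. The sketch below assumes the sample function $f(x) = x^3 + x$ and the point $x_0 = 1$, neither of which is from the text. The inverse function g is computed by bisection, and its derivative at $y_0 = 2$ is approximated by a difference quotient.

```python
def f(x):
    return x ** 3 + x          # assumed sample function, strictly increasing

def g(y, lo=-10.0, hi=10.0):
    """Inverse of f on [lo, hi], found by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.0
y0 = f(x0)                                     # y0 = 2
h = 1e-5
g_prime = (g(y0 + h) - g(y0 - h)) / (2 * h)    # difference quotient for g'(y0)
f_prime = 3 * x0 ** 2 + 1                      # f'(x0) = 4
print(g_prime, 1 / f_prime)
```

Both printed numbers should be close to 1/4.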

Example 5. The formula for the derivative of the inverse function in
the case of one variable can be generalized to the case of the inverse
mapping theorem. Suppose that the map $F: U \to V$ has a $C^1$-inverse
$G: V \to U$. Let X be a point of U. Then $G \circ F = I$ is the identity, and
since I is linear, we see directly from the definition of the derivative that
I'(X) = I. Using the chain rule, we find that

\[ I = (G \circ F)'(X) = G'(F(X)) \circ F'(X) \]

for all X in U. In particular, this means that if Y = F(X), then

\[ G'(Y) = F'(X)^{-1} \]


where the inverse in this last expression is to be understood as the inverse 
of the linear map F'(X). Thus we have generalized the formula for the 
derivative of an inverse function. 


Example 6. Let $F(x, y) = (e^x \cos y, e^x \sin y)$. Show that F is locally
invertible at every point.

We find that

\[ J_F(x, y) = \begin{pmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{pmatrix}, \quad \text{whence} \quad \Delta_F(x, y) = e^{2x} \neq 0. \]

Since the Jacobian determinant is not 0, it follows that F is locally
invertible at (x, y) for all x, y.

Example 7. Let U be open in $R^2$ and let $f: U \to R$ be a $C^1$-function.
Let (a, b) be a point of U. Assume that $D_2 f(a, b) \neq 0$. Then the map F
given by

\[ (x, y) \mapsto F(x, y) = (x, f(x, y)) \]
is locally invertible at (a, b).


Proof. All we have to do is compute the Jacobian matrix and determinant.
We have

\[ J_F(x, y) = \begin{pmatrix} 1 & 0 \\ \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix} \]

so that

\[ J_F(a, b) = \begin{pmatrix} 1 & 0 \\ D_1 f(a, b) & D_2 f(a, b) \end{pmatrix} \]

and hence

\[ \Delta_F(a, b) = D_2 f(a, b). \]

By assumption, this is not 0, and the inverse mapping theorem implies
what we want.

The result of Example 7 can be used to discuss implicit functions. 
Again let $f: U \to R$ be as in Example 7, and assume that f(a, b) = c.
We ask whether there is some differentiable function $y = \varphi(x)$ defined
near x = a such that $\varphi(a) = b$ and

\[ f(x, \varphi(x)) = c \]

for all x near a. If such a function $\varphi$ exists, then we say that $y = \varphi(x)$
is the function determined implicitly by f.

Theorem 4 (Implicit Function Theorem). Let U be open in $R^2$ and let
$f: U \to R$ be a $C^1$-function. Let (a, b) be a point of U, and let f(a, b) = c.
Assume that $D_2 f(a, b) \neq 0$. Then there exists an implicit function
$y = \varphi(x)$ which is $C^1$ in some interval containing a, and such that
$\varphi(a) = b$.


Proof. We apply Example 7 and use the notation of that example.
Thus we let

\[ F(x, y) = (x, f(x, y)). \]

We know that F(a, b) = (a, c) and that there exists a $C^1$-inverse G
defined locally near (a, c). The inverse map G has two coordinate functions,
and we can write G(x, z) = (x, g(x, z)) for some function g. Thus
we put y = g(x, z), and z = f(x, y). We define

\[ \varphi(x) = g(x, c). \]

Then on the one hand,

\[ F(x, \varphi(x)) = F(x, g(x, c)) = F(G(x, c)) = (x, c), \]

and on the other hand,

\[ F(x, \varphi(x)) = (x, f(x, \varphi(x))). \]

This proves that $f(x, \varphi(x)) = c$. Furthermore, by definition of an inverse
map, G(a, c) = (a, b) so that $\varphi(a) = b$. This proves the implicit function
theorem.

Example 8. Let $f(x, y) = x^2 + y^2$ and let (a, b) = (1, 1). Then
c = f(1, 1) = 2. We have $D_2 f(x, y) = 2y$ so that

\[ D_2 f(1, 1) = 2 \neq 0, \]

so the implicit function $y = \varphi(x)$ near x = 1 exists. In this case, we can
of course solve explicitly for y, namely

\[ y = \sqrt{2 - x^2}. \]

Example 9. We take $f(x, y) = x^2 + y^2$ as in Example 8, and
(a, b) = (-1, -1). Then again c = f(-1, -1) = 2, and

\[ D_2 f(-1, -1) = -2 \neq 0. \]

In this case we can still solve for y in terms of x, namely

\[ y = -\sqrt{2 - x^2}. \]
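The two branches of Examples 8 and 9 can be recovered numerically without using the explicit formula. The sketch below is an illustration only: it solves f(x, y) = 2 for y by Newton's method in the variable y, which divides by $D_2 f$ at each step, and the starting value b decides which branch is found. The helper name `implicit_y` and the sample values x = 1.1 and x = -1.1 are our own.

```python
import math

def f(x, y):
    return x * x + y * y

def D2f(x, y):
    return 2 * y

def implicit_y(x, y_start, c=2.0, steps=30):
    """Solve f(x, y) = c for y by Newton's method in y, starting at y_start."""
    y = y_start
    for _ in range(steps):
        y -= (f(x, y) - c) / D2f(x, y)
    return y

upper = implicit_y(1.1, 1.0)      # branch through (1, 1)
lower = implicit_y(-1.1, -1.0)    # branch through (-1, -1)
print(upper, math.sqrt(2 - 1.1 ** 2))
print(lower, -math.sqrt(2 - 1.1 ** 2))
```

The iteration breaks down where $D_2 f = 2y$ vanishes, that is on the x-axis, which is exactly where the hypothesis of the implicit function theorem fails for this f.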

In general, the equation f(x,y ) = c defines some curve as in the 
following picture. 



Figure 5 


Near the point (a, b) as indicated in the picture, we see that there is an
implicit function:



Figure 6 


but that one could not define the implicit function for all x, only for those
x near a.


Example 10. Let $f(x, y) = x^2 y + 3y^3 x^4 - 4$. Take (a, b) = (1, 1)
so that f(a, b) = 0. Then $D_2 f(x, y) = x^2 + 9y^2 x^4$ and

\[ D_2 f(1, 1) = 10 \neq 0. \]

Hence the implicit function $y = \varphi(x)$ exists, but there is no simple way
to solve for it. We can also determine the derivative $\varphi'(1)$. Indeed,
differentiating the equation f(x, y) = 0, knowing that $y = \varphi(x)$ is a
differentiable function, we find

\[ 2xy + x^2 y' + 12y^3 x^3 + 9y^2 y' x^4 = 0, \]

whence we can solve for $y' = \varphi'(x)$, namely

\[ \varphi'(x) = y' = -\frac{2xy + 12y^3 x^3}{x^2 + 9y^2 x^4}. \]

Hence

\[ \varphi'(1) = -\frac{2 + 12}{1 + 9} = -\frac{7}{5}. \]
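Even though there is no simple formula for the implicit function of Example 10, its values and its slope at x = 1 can be approximated. The following sketch, with helper names of our own, solves f(x, y) = 0 for y near 1 by Newton's method in y and compares a difference quotient with the value of $-D_1 f / D_2 f$ at (1, 1).

```python
def f(x, y):
    return x ** 2 * y + 3 * y ** 3 * x ** 4 - 4

def D1f(x, y):
    return 2 * x * y + 12 * y ** 3 * x ** 3

def D2f(x, y):
    return x ** 2 + 9 * y ** 2 * x ** 4

def phi(x, y_start=1.0, steps=50):
    """Solve f(x, y) = 0 for y near 1 by Newton's method in y."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / D2f(x, y)
    return y

h = 1e-5
slope_numeric = (phi(1 + h) - phi(1 - h)) / (2 * h)   # difference quotient
slope_formula = -D1f(1.0, 1.0) / D2f(1.0, 1.0)        # -(2 + 12)/(1 + 9)
print(slope_numeric, slope_formula)
```

Both numbers should agree with -7/5 up to the accuracy of the difference quotient.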
In Exercise 4 we give the general formula for an arbitrary function f.

Example 11. In general, given any equation f(x, y) = 0 and $y = \varphi(x)$,
we can find $\varphi'(x)$ by differentiating in the usual way. For instance, suppose

\[ x^3 + 4y \sin(xy) = 0. \]

Then taking the derivative with respect to x, we find

\[ 3x^2 + 4y' \sin(xy) + 4y \cos(xy)(y + xy') = 0. \]

We then solve for y' as

\[ y' = -\frac{4y^2 \cos(xy) + 3x^2}{4 \sin(xy) + 4xy \cos(xy)} \]

whenever $4 \sin(xy) + 4xy \cos(xy) \neq 0$. Similarly, we can solve for y''
by differentiating either of the last two expressions. In the present case,
this gets complicated.

Exercises 


1. Determine whether the following mappings are locally $C^1$-invertible at the
given point.

(a) $F(x, y) = (x^2 - y^2, 2xy)$ at $(x, y) \neq (0, 0)$

(b) F(x, y) = (x*y + 1, x + y 2 ) at (1, 2) 

(c) $F(x, y) = (x + y, y^{1/4})$ at (1, 16)

(d) Hx, y) = > ^75) at <*• * «>■ 

(e) F(x, y) = U + x 2 + y, x + y 2 ) at (x, y) = (5, 8) 

2. Determine whether the following mappings are locally $C^1$-invertible at
the indicated point.

(a) $F(x, y) = (x + y, x^2 y)$ at (1, 2)

(b) $F(x, y) = (\sin x, \cos xy)$ at $(\pi, \pi/2)$

(c) $F(x, y) = (e^{xy}, \log x)$ at (1, 4)

(d) $F(x, y, z) = (xz, xy, yz)$ at (1, 1, -1)

3. Show that the map defined by $F(x, y) = (e^x \cos y, e^x \sin y)$ is not invertible
on all of $R^2$, even though it is locally invertible everywhere.

4. Let $y = \varphi(x)$ be an implicit function satisfying $f(x, \varphi(x)) = 0$, both f, $\varphi$
being $C^1$. Show that

\[ \varphi'(x) = -\frac{D_1 f(x, \varphi(x))}{D_2 f(x, \varphi(x))} \]

wherever $D_2 f(x, \varphi(x)) \neq 0$.

5. Find an expression for $\varphi''(x)$ by differentiating the preceding expression
for $\varphi'(x)$.

6. Let $f(x, y) = (x - 2)^3 y + x e^{y-1}$. Is $D_2 f(a, b) \neq 0$ at the following
points (a, b)?

(a) (1, 1)    (b) (0, 0)    (c) (2, 1)
7. Let f be a $C^1$-function of 3 variables (x, y, z) defined on an open set U of
$R^3$. Let (a, b, c) be a point of U, and assume f(a, b, c) = 0, $D_3 f(a, b, c) \neq 0$.
Show that there exists a $C^1$-function $\varphi(x, y)$ defined near (a, b) such that

\[ f(x, y, \varphi(x, y)) = 0 \quad \text{and} \quad \varphi(a, b) = c. \]

We call $\varphi$ the implicit function $z = \varphi(x, y)$ determined by f at (a, b).

8. In Exercise 7, show that

\[ D_1 \varphi(a, b) = -\frac{D_1 f(a, b, c)}{D_3 f(a, b, c)}. \]

9. For each of the following functions f, show that f(x, y) = 0 defines an
implicit function $y = \varphi(x)$ at the given point (a, b), and find $\varphi'(a)$.

(a) $f(x, y) = x^2 - xy + y^2 - 3$ at (1, 2)

(b) $f(x, y) = x \cos xy$ at $(1, \pi/2)$

(c) $f(x, y) = 2e^{x+y} - x + y$ at (1, -1)

(d) $f(x, y) = x e^y - y + 1$ at (-1, 0)

(e) $f(x, y) = x + y + x \sin y$ at (0, 0)

(f) $f(x, y) = x^5 + y^5 + xy + 4$ at (2, -2)

10. For each of the following functions f(x, y, z), show that f(x, y, z) = 0
defines an implicit function $z = \varphi(x, y)$ at the given point (a, b, c) and find
$D_1 \varphi(a, b)$ and $D_2 \varphi(a, b)$.

(a) $f(x, y, z) = x + y + z + \cos xyz$ at (0, 0, -1)

(b) fix, y, z) — z z — z — xy sin z at (0, 0, 0) 

(c) $f(x, y, z) = x^3 + y^3 + z^3 - 3xyz - 4$ at (1, 1, 2)

(d) $f(x, y, z) = x + y + z - e^{xyz}$ at (0,

11. Let $f(x, y, z) = x^3 - 2y^2 + z^2$. Show that f(x, y, z) = 0 defines an
implicit function $x = \varphi(y, z)$ at the point (1, 1, 1). Find $D_1 \varphi$ and $D_2 \varphi$ at
the point (1, 1).

12. In Exercise 10, show that f(x, y, z) = 0 also determines y as an implicit
function of (x, z) and z as an implicit function of (x, y) at the given point.
Find the partial derivatives of these implicit functions at the given point.




PART FOUR 

MULTIPLE INTEGRATION 



The theory of integration has been separated logically into two parts, 
for the convenience of those using the text in different circumstances. 

For those who use the text only one term, or who wish to deal with
multiple integration early, Chapter XII gives the basic techniques of
multiple integrals, and is independent of the linear algebra or determinants.
It can be read immediately after the chapter on vectors, i.e.
after Chapter I.

Similarly, the first section on Green’s theorem can be read after knowing
about curve integrals and double integrals. It is independent of
Chapter XIII and of the linear algebra. It provides a good application
at a quite elementary level for both curve integrals and double integrals,
by showing the relation between the two.

Chapter XV, the last, is independent of the change of variables formula.
It requires essentially only multiple integration, and the algebra of vectors.
The section on Jacobian matrices is used incidentally, to give more geometric
motivation to tangent planes and area.

Chapter XIII on the change of variables formula is the most expendable
for a class pressed for time. It uses determinants. If there is time, it
can be used to give a more direct proof for the value of the volume of an
elementary spherical region, computed ad hoc in Chapter XII.



CHAPTER XII 


Multiple Integrals 


When studying functions of one variable, it was possible to give essentially
complete proofs for the existence of an integral of a continuous
function over an interval. The investigation of the integral involved lower
sums and upper sums.

In order to develop a theory of integration for functions of several
variables, it becomes necessary to have techniques whose degree of sophistication
is somewhat greater than that which is available to us. Hence we
shall only state results, and omit most of the proofs, except in special
cases. These results will allow us to compute multiple integrals.

Even in these special cases, the proofs should be omitted in any class which 
is not very hip on theory. 

We shall also list various formulas giving double and triple integrals 
in terms of polar coordinates, and we give a geometric argument to make 
them plausible. Here again, the general formula for changing variables 
in a multiple integral can be handled theoretically (and elegantly) only 
when much more machinery is available than we have at present. The 
proofs properly belong to an advanced calculus course. [Cf. Introduction 
to Analysis .] 


§1. Double integrals 

We begin by discussing the analogue of upper and lower sums associated
with partitions.

Let R be a region of the plane (Fig. 1), and let f be a function defined on
R. We shall say that f is bounded if there exists a number M such that
$|f(X)| \leq M$ for all X in R.

Figure 1
Let a, b be two numbers with $a \leq b$, and let c, d be two numbers with
$c \leq d$. We consider the closed interval [a, b] on the x-axis and the closed
interval [c, d] on the y-axis. These determine a rectangle R in the plane,
consisting of all pairs of points (x, y) with $a \leq x \leq b$ and $c \leq y \leq d$.
The rectangle R above will be denoted by $[a, b] \times [c, d]$.

Let I denote the interval [a, b]. By a partition $P_I$ of I we mean
a sequence of numbers

\[ x_1 = a \leq x_2 \leq \cdots \leq x_m = b \]

which we also write as $P_I = (x_1, \ldots, x_m)$. Similarly, by a partition $P_J$
of the interval J = [c, d] we mean a sequence of numbers

\[ y_1 = c \leq y_2 \leq \cdots \leq y_n = d \]

which we write as $P_J = (y_1, \ldots, y_n)$.

Each pair of small intervals $[x_i, x_{i+1}]$ and $[y_j, y_{j+1}]$ determines a
rectangle

\[ S_{ij} = [x_i, x_{i+1}] \times [y_j, y_{j+1}]. \]

(Cf. Fig. 2(a).) We denote symbolically by $P = P_I \times P_J$ the partition
of R into rectangles $S_{ij}$ and we call such $S_{ij}$ a subrectangle of the partition
(Fig. 2(b)).



Figure 2

If R is a rectangle as above, we define its 2-dimensional volume (that is,
its area) to be the obvious thing, namely

\[ \mathrm{Area}(R) = (d - c)(b - a). \]

Thus the area of each subrectangle $S_{ij}$ is $(y_{j+1} - y_j)(x_{i+1} - x_i)$.

Let A be a region in the plane, and let f be a function defined on A. As
usual, we say that f is continuous at a point P of A if

\[ \lim_{X \to P} f(X) = f(P). \]

We say that f is continuous on A if it is continuous at every point of A.
If S is a set and f a function on S which reaches a maximum on S, we let

\[ \max_S f \]

denote this maximum value. It is a value f(v) for some point v in S such
that $f(v) \geq f(w)$ for all w in S. Similarly, we let

\[ \min_S f \]

be the minimum value of the function on S, if it exists. We recall a fact 
which we do not prove, that a continuous function on a closed and bounded 
set always takes on a maximum and minimum value. For instance, a 
continuous function on a closed interval [a, b] always has a maximum. A 
continuous function on a rectangle as above also has a maximum, and a 
minimum. 

We then form sums which are analogous to the lower and upper sums
used to define the integral of functions of one variable. If P denotes the
partition as above, and f is a continuous function on R, we define

\[ L(P, f) = \sum_S (\min_S f) \, \mathrm{Area}(S), \]

\[ U(P, f) = \sum_S (\max_S f) \, \mathrm{Area}(S). \]

The symbol $\sum_S$ means that we must take the sum over all subrectangles
of the partition. In terms of the indices i, j, we can rewrite say the lower
sum as

\[
\begin{aligned}
L(P, f) &= \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} (\min_{S_{ij}} f)(y_{j+1} - y_j)(x_{i+1} - x_i) \\
&= \sum_i \sum_j (\min_{S_{ij}} f) \, \mathrm{Area}(S_{ij}),
\end{aligned}
\]

and similarly for the upper sum.
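The lower and upper sums are easy to compute when the minimum and maximum on each subrectangle are known. The sketch below makes a simplifying assumption which is not in the text: the function is increasing in each variable, so that on every subrectangle the minimum is taken at the lower left corner and the maximum at the upper right corner. The sample function f(x, y) = x + y on $[0, 1] \times [0, 1]$ and the helper names are our own.

```python
def lower_upper_sums(f, xs, ys):
    """Lower and upper sums of f over the partition xs x ys.

    Assumes f is increasing in each variable, so that on each subrectangle
    the minimum is at the lower-left corner and the maximum at the upper-right.
    """
    lower = upper = 0.0
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            area = (ys[j + 1] - ys[j]) * (xs[i + 1] - xs[i])
            lower += f(xs[i], ys[j]) * area
            upper += f(xs[i + 1], ys[j + 1]) * area
    return lower, upper

def uniform(a, b, n):
    """Partition of [a, b] into n intervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

f = lambda x, y: x + y          # assumed sample function
L4, U4 = lower_upper_sums(f, uniform(0, 1, 4), uniform(0, 1, 4))
print(L4, U4)
```

With four intervals in each direction the lower sum is 0.75 and the upper sum is 1.25. For this f and n equal intervals in each direction, the two sums are 1 - 1/n and 1 + 1/n.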

Let $v_{ij}$ be a point in the small rectangle $S_{ij}$ such that $f(v_{ij})$ is a maximum
of f on this rectangle. Then the upper sum U(P, f) can be written also in
the form

\[
\begin{aligned}
U(P, f) &= \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} f(v_{ij})(y_{j+1} - y_j)(x_{i+1} - x_i) \\
&= \sum_i \sum_j f(v_{ij}) \, \mathrm{Area}(S_{ij}).
\end{aligned}
\]

If $f(v_{ij})$ is neither a maximum nor a minimum for f on $S_{ij}$, then the
above sum lies between the upper and lower sum, and is called a
Riemann sum for f.

Just as in the case of functions of one variable, we can then take refinements
of partitions. If $P_I'$ is a partition of I, we say that $P_I'$ is a refinement
of $P_I$ if every number of $P_I$ is among the numbers of $P_I'$. If $P_J'$ is a
refinement of $P_J$, then we call $P_I' \times P_J' = P'$ a refinement of P.
We omit the proof of the following lemma, which is entirely similar to
the one variable case.

Lemma. If P' is a refinement of P, then

\[ L(P, f) \leq L(P', f) \leq U(P', f) \leq U(P, f). \]

In other words, the lower sums increase under refinements of the partition,
while the upper sums decrease.
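The inequalities of the lemma can be observed on a small example. The sketch below rests on assumptions of our own: the sample function $f(x, y) = x^2 + y$ on $[0, 1] \times [0, 1]$, which is increasing in each variable there, so that minima and maxima on subrectangles are taken at corners, and two partitions P and P' where P' contains every point of P.

```python
def lower_upper_sums(f, xs, ys):
    """Lower and upper sums over the partition xs x ys, for f increasing in x and y."""
    lower = upper = 0.0
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            area = (ys[j + 1] - ys[j]) * (xs[i + 1] - xs[i])
            lower += f(xs[i], ys[j]) * area           # minimum at lower-left corner
            upper += f(xs[i + 1], ys[j + 1]) * area   # maximum at upper-right corner
    return lower, upper

f = lambda x, y: x * x + y          # assumed sample function

P_x, P_y = [0.0, 0.5, 1.0], [0.0, 1.0]               # the partition P
Q_x, Q_y = [0.0, 0.25, 0.5, 1.0], [0.0, 0.5, 1.0]    # a refinement P' of P

L_P, U_P = lower_upper_sums(f, P_x, P_y)
L_Q, U_Q = lower_upper_sums(f, Q_x, Q_y)
print(L_P, L_Q, U_Q, U_P)
```

The four printed numbers appear in increasing order, with the lower sums to the left of the upper sums, as the lemma asserts.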

We define f to be integrable on R if there exists a unique number which
is greater than or equal to every lower sum, and less than or equal to
every upper sum. Formulated in another way, we can say that f is integrable
on R if and only if the least upper bound of all lower sums is equal
to the greatest lower bound of all upper sums. If this number exists, we
call it the integral of f, and denote it by

\[ \int_R f \quad \text{or} \quad \iint_R f(x, y) \, dy \, dx. \]

We can interpret the integral as a volume under certain conditions.
Namely, suppose that $f(x, y) \geq 0$ for all (x, y) in R. The value f(x, y)
may be viewed as a height above the point (x, y), and we may consider
the integral of f as the volume of the 3-dimensional region lying above
the rectangle R and bounded from above by the graph of f (Fig. 3).

Each term

\[ (\min_S f) \, \mathrm{Area}(S) \]

is the volume of a rectangular box whose base is the rectangle S in the
(x, y)-plane, and whose height is $\min_S f$. The volume of such a box is
precisely $(\min_S f) \, \mathrm{Area}(S)$, where, as we said above, Area(S) is the
2-dimensional volume of S, that is its area. This box lies below the
3-dimensional region bounded from above by the graph of f. Similarly, the term

\[ (\max_S f) \, \mathrm{Area}(S) \]

is the volume of a box whose base is S and whose height is $\max_S f$. This
box contains the part of the above region lying over S. This makes our
interpretation of the integral as volume clear.

Also, as in one variable, a positive function on a region may be viewed
as a density, and thus if $f \geq 0$ on R, then we also interpret

\[ \iint_R f(x, y) \, dy \, dx \]

as the mass of R.

Theorem 1. Assume that f, g are functions on the rectangle R, and are
integrable. Then f + g is integrable. If k is a number, then kf is integrable.
We have:

\[ \int_R (f + g) = \int_R f + \int_R g \quad \text{and} \quad \int_R (kf) = k \int_R f. \]

In other words, integrable functions form a vector space, and the integral
is a linear map on this vector space.

Proof. Let P be a partition of R and let S be a subrectangle of the
partition. For any point v in S we have

\[ \min_S f \leq f(v) \quad \text{and} \quad \min_S g \leq g(v), \]

whence

\[ \min_S f + \min_S g \leq f(v) + g(v). \]

Thus $\min_S f + \min_S g$ is a lower bound for all values f(v) + g(v). Hence
by definition of a greatest lower bound, we obtain the inequality

\[ \min_S f + \min_S g \leq \min_{v \in S} (f(v) + g(v)) = \min_S (f + g). \]

Consequently we get

\[
\begin{aligned}
L(P, f) + L(P, g) &= \sum_S (\min_S f) \, \mathrm{Area}(S) + \sum_S (\min_S g) \, \mathrm{Area}(S) \\
&= \sum_S (\min_S f + \min_S g) \, \mathrm{Area}(S) \\
&\leq \sum_S \min_S (f + g) \, \mathrm{Area}(S) \\
&= L(P, f + g).
\end{aligned}
\]
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By a similar argument, we find that 

L(P,f) + L(P, g) ^ L(P,f+ g ) ^ U(P,f+ g ) :g U(P,f) + U(P,g). 

Since L(P,f) and U(P,f) are arbitrarily close together for suitable parti¬ 
tions, and L(P, g), U(P, g) are arbitrarily close together for suitable 
partitions, we see by the usual squeezing process of limits that L(P,f + g) 
and U(P,f + g ) are arbitrarily close together for suitable partitions. This 
proves that/ g is integrable. 
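The first inequality in the chain can be strict, because f and g need not take their minima at the same point. A small illustration, with functions chosen by us and not taken from the text: on the square S = [0, 1] × [0, 1], let f(x, y) = x and g(x, y) = 1 − x. Then

```latex
\[
\min_S f + \min_S g = 0 + 0 = 0
\qquad\text{while}\qquad
\min_S (f + g) = \min_S 1 = 1 .
\]
```

The squeezing argument is unaffected, since only the inequalities are used.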

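As for the constant k, we note that for $k \ge 0$,

$$\min_S (kf) = \min_{v \in S} (k f(v)) = k \cdot \min_{v \in S} f(v).$$

Hence k comes out as a factor in each term of the lower sum, and similarly
for the upper sum, so that

$$kL(P, f) = L(P, kf) \le U(P, kf) = kU(P, f).$$

(If k < 0, multiplication by k interchanges min and max, so that
L(P, kf) = kU(P, f) and U(P, kf) = kL(P, f), and the same argument applies.)
From this our second assertion follows.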
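Linearity is already visible at the level of approximating sums. The sketch below is ours, not the text's: the name `midpoint_sum` and the sample functions are illustrative choices. It samples each subrectangle at its midpoint and compares the sum for f + g and for kf with the corresponding combinations of sums.

```python
def midpoint_sum(f, a, b, c, d, n):
    """Riemann sum of f over [a, b] x [c, d], sampling each of the n*n
    subrectangles at its midpoint."""
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            total += f(x, y) * dx * dy
    return total


if __name__ == "__main__":
    f = lambda x, y: x * x * y      # illustrative choices, not from the text
    g = lambda x, y: x + y
    k = 3.0
    box = (0.0, 1.0, 0.0, 2.0)
    n = 100
    # Sum for f + g against the sum of the two separate sums.
    print(midpoint_sum(lambda x, y: f(x, y) + g(x, y), *box, n),
          midpoint_sum(f, *box, n) + midpoint_sum(g, *box, n))
    # Sum for k*f against k times the sum for f.
    print(midpoint_sum(lambda x, y: k * f(x, y), *box, n),
          k * midpoint_sum(f, *box, n))
```

The pairs of printed numbers agree up to rounding, which is the finite version of the two formulas in Theorem 1.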

Theorem 2. If f, g are integrable on R, and $f \le g$, then

$$\int_R f \le \int_R g.$$

Proof. We have for each subrectangle S of a partition P:

$$\min_S f \le f(v) \le g(v)$$

for all v in S. Hence $\min_S f$ is a lower bound for the values of g on S, and
hence

$$\min_S f \le \min_S g.$$

Consequently

$$L(P, f) = \sum_S (\min_S f)\,\mathrm{Area}(S) \le \sum_S (\min_S g)\,\mathrm{Area}(S) = L(P, g) \le \int_R g.$$

Since $\int_R g$ is an upper bound for L(P, f), it follows that the least upper
bound of all lower sums for f is $\le \int_R g$, in other words

$$\int_R f \le \int_R g,$$

as was to be shown.

Theorem 3c. Let R be a rectangle, and let f be a function defined and 
continuous on R. Then f is integrable on R. 

We shall not give the proof of Theorem 3c. 

We need a somewhat more general discussion to deal with applications
which arise naturally in practice. A function f is usually not given on a
rectangle but on some region A in the plane. We say that A is bounded if
there exists a number M such that $\|X\| \le M$ for all points X in A. Any
bounded region is contained in a rectangle, as shown on Fig. 4.



Figure 4 


The set of boundary points of the region A will be called the boundary of
A. We shall say that the boundary is smooth if it consists of a finite number
of smooth curves. A smooth curve means a $C^1$ curve, i.e. a curve
parametrized such that the coordinate functions have continuous derivatives,
as studied in Chapter II. The boundary of A in Fig. 4 consists of three such
curves. We draw a finite number of $C^1$ curves in the next picture.



Figure 5 


Suppose the function f is defined on a region A as in Fig. 4, so that A is
bounded and has a boundary which is smooth. If we want to integrate f
over the region A, then it is natural to extend the definition of f to the
whole rectangle R, by letting

$$f(v) = 0$$

for every point v in R such that v does not lie in A. Then even if we
assume that f is continuous on A, we see that f is in general not continuous
on R. The points of discontinuity lie on the boundary of A.
Therefore we cannot apply Theorem 3c directly, and we need a minor
adjustment of our definitions to deal with this case, which we now discuss.

Suppose that instead of being continuous on R the function f is merely
bounded, and so has a least upper bound and a greatest lower bound.
Let P be a partition of R, and let S be a subrectangle of the partition. By

$$\operatorname{lub}_S f = \operatorname{lub}_{v \in S} f(v)$$

we mean the least upper bound of all values f(v) for v in S. We take it as
a known property of the real numbers that any bounded set of numbers
has a least upper bound, and also a greatest lower bound. Similarly, we
denote by

$$\operatorname{glb}_S f = \operatorname{glb}_{v \in S} f(v)$$

the greatest lower bound of all values of f on S. We may then form upper
and lower sums with the least upper bound and greatest lower bound,
respectively, that is:

$$U(P, f) = \sum_S (\operatorname{lub}_S f)\,\mathrm{Area}(S)$$

and

$$L(P, f) = \sum_S (\operatorname{glb}_S f)\,\mathrm{Area}(S).$$

Then Theorems 1 and 2 hold for this more general type of function, and
the proofs are the same, replacing max by lub and min by glb. We also
have an extension of Theorem 3c which applies to all practical cases which
we shall meet.

Theorem 3. Let R be a rectangle and let f be a function defined on R, 
bounded, and continuous except possibly at the points lying on a finite 
number of smooth curves. Then f is integrable on R. 

Again, we shall not prove Theorem 3. However, we make some comments
to indicate the main idea in the proof.

Suppose we are interested in the area of the region A, contained in the
rectangle R as in Fig. 4. Let us partition the rectangle into small rectangles
$S_{ij}$ as before. Let f be the function which takes on the value 1 in A and
has the value 0 at any point not in A. Let $v_{ij}$ be a point in $S_{ij}$. Let us
consider an approximating sum

$$\sum_{i,j} f(v_{ij})\,\mathrm{Area}(S_{ij}).$$

If the rectangle $S_{ij}$ lies entirely within the region A, then $f(v_{ij}) = 1$ and
the above sum has a term contributing the area of $S_{ij}$. If the rectangle $S_{ij}$
lies entirely outside the region A, then $f(v_{ij}) = 0$, and the corresponding
term in the sum is equal to 0. Therefore the terms in the sum which may
or may not give a positive contribution are those such that $S_{ij}$ touches the
boundary of A. Suppose that the diameter of each rectangle $S_{ij}$ is small,
say at most $\epsilon$. We can achieve this by taking a very fine partition of the
rectangle. Let L be the length of the boundary. Then the contribution to
the above sum arising from those rectangles meeting the boundary of A will
be approximately equal to $\epsilon L$, and therefore will tend to 0 as $\epsilon$ tends to 0.
This means that if we make the partition very fine, we get a good
approximation to the area of A by means of the above sum. A similar argument
applies when we deal with a more general function f.
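The shrinking contribution of the boundary can be observed on a concrete region. The following sketch is ours: the choice of the unit disc for A, the square [-1, 1] × [-1, 1] for R, and the name `disc_area_bounds` are assumptions made for illustration. It adds up the subrectangles lying entirely inside A and those that meet A at all. The difference is exactly the total area of the subrectangles touching the boundary.

```python
import math


def disc_area_bounds(n):
    """Bracket the area of the unit disc A using an n-by-n partition of the
    rectangle R = [-1, 1] x [-1, 1].

    inner: total area of the subrectangles lying entirely inside A
    outer: total area of the subrectangles that meet A at all
    """
    h = 2.0 / n
    inner = 0.0
    outer = 0.0
    for i in range(n):
        x0, x1 = -1.0 + i * h, -1.0 + (i + 1) * h
        for j in range(n):
            y0, y1 = -1.0 + j * h, -1.0 + (j + 1) * h
            # The farthest point of the subrectangle from the origin is a corner.
            far = math.hypot(max(abs(x0), abs(x1)), max(abs(y0), abs(y1)))
            # The nearest point is obtained by clamping the origin into it.
            near = math.hypot(min(max(0.0, x0), x1), min(max(0.0, y0), y1))
            if far <= 1.0:
                inner += h * h
            if near <= 1.0:
                outer += h * h
    return inner, outer


if __name__ == "__main__":
    for n in (50, 100, 200):
        inner, outer = disc_area_bounds(n)
        print(n, inner, outer, outer - inner)   # the gap shrinks roughly like 1/n
```

The gap `outer - inner` behaves like a constant times the mesh size, in line with the estimate by the length of the boundary times the diameter of the small rectangles.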

Let A be a region in the plane, contained in a rectangle R (Fig. 4).
Let f be a function defined on A. We denote by $f_A$ the function which has
the same values as f at points of A, and such that $f_A(Q) = 0$ if Q is a point
not in A. Then $f_A$ is defined on the rectangle R, and we define

$$\int_A f = \int_R f_A$$

provided that $f_A$ is integrable. By Theorem 3, we note that if the boundary
of A is smooth, and if f is continuous on A, then $f_A$ is continuous except
possibly at points lying on the boundary of A, and hence $f_A$ is integrable.

We now have one more property of the integral, which is convenient when
integrating a function over several regions.

Theorem 4. Let A be a bounded region in the plane, expressed as a
union of two regions $A_1$ and $A_2$ having no points in common except
possibly boundary points. Assume that the boundaries of A, $A_1$, $A_2$ are
smooth. If f is a function defined on A and continuous except at a finite
number of smooth curves, then

$$\int_A f = \int_{A_1} f + \int_{A_2} f.$$

Furthermore, if A is itself some smooth curve, contained in a rectangle
R, and if f is a bounded function on R which has the value 0 except
possibly for points of A, then

$$\int_R f = 0.$$

We shall not give the proof of Theorem 4, which anyhow is intuitively
clear. In Fig. 6(a) we have drawn a smooth curve A in R where f may not be
0, and such that f(v) = 0 if v lies in R but v is not a point of A. Then

$$\int_R f = 0.$$

This is reasonable because the 2-dimensional area of a curve is 0. In Fig.
6(b) we have drawn three regions $A_1$, $A_2$, $A_3$ which have only smooth
curves in common. The integral of a function f over the three regions is
then the sum of the integrals of f over each region separately.



Exercises 

1. Let f be a continuous bounded function on a rectangle R. Let M be a
number such that $|f(v)| \le M$ for all v in R. Show that

$$\left|\int_R f\right| \le M\,\mathrm{Area}(R).$$

This estimate is also true for an integrable function f. Does your proof apply
to this more general case? It should.


§2. Repeated integrals

To compute the integral we shall investigate double integrals. 

Let f be a function defined on our rectangle. For each x in the interval
[a, b] we have a function $f_x$ of y given by $f_x(y) = f(x, y)$, and this function
$f_x$ is defined on the interval [c, d]. Assume that for each x the function $f_x$
is integrable over this interval (in the old sense of the word, for functions
of one variable). We may then form the integral

$$\int_c^d f_x(y)\,dy = \int_c^d f(x, y)\,dy.$$

The expression we obtain depends on the particular value of x chosen in
the interval [a, b], and is thus a function of x. Assume that this function
is integrable over the interval [a, b]. We can then take the integral

$$\int_a^b \left[\int_c^d f(x, y)\,dy\right] dx, \quad\text{also written}\quad \int_a^b\!\int_c^d f(x, y)\,dy\,dx,$$

which is called the repeated integral of f.

Example 1. Let $f(x, y) = x^2 y$. Find the repeated integral of f over
the rectangle determined by the intervals [1, 2] on the x-axis and [−3, 4]
on the y-axis.







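We must find the repeated integral

$$\int_1^2\!\int_{-3}^{4} f(x, y)\,dy\,dx.$$

To do this, we first compute the integral with respect to y, namely

$$\int_{-3}^{4} x^2 y\,dy.$$

For a fixed value of x, we can take $x^2$ out of the integral, and hence this
inner integral is equal to

$$x^2 \int_{-3}^{4} y\,dy = x^2\,\frac{y^2}{2}\,\Big|_{-3}^{4} = \frac{7x^2}{2}.$$

We then integrate with respect to x, namely

$$\int_1^2 \frac{7x^2}{2}\,dx = \frac{7}{2}\cdot\frac{x^3}{3}\,\Big|_1^2 = \frac{49}{6}.$$

Thus the integral of f over the rectangle is equal to 49/6.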
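The hand computation of Example 1 gives 49/6 for $f(x, y) = x^2 y$ over [1, 2] × [−3, 4]. A numerical cross-check can be sketched as follows. The helper names `simpson` and `repeated_integral` are ours, and Simpson's rule is simply a convenient one-variable method that happens to be exact for this polynomial integrand.

```python
def simpson(g, lo, hi, n=20):
    """Composite Simpson rule for a function of one variable (n must be even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def repeated_integral(f, a, b, c, d, n=20):
    """Integrate f(x, y) first over y in [c, d], then over x in [a, b]."""
    inner = lambda x: simpson(lambda y: f(x, y), c, d, n)   # function of x alone
    return simpson(inner, a, b, n)


if __name__ == "__main__":
    value = repeated_integral(lambda x, y: x * x * y, 1.0, 2.0, -3.0, 4.0)
    print(value, 49 / 6)
```

The structure mirrors the definition: the inner integral is formed for each fixed x, and the resulting function of x is then integrated.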

The repeated integral is useful in computing a double integral because of 
the following theorem, which will be proved after discussing some examples. 


Theorem 5. Let R be a rectangle [a, b] × [c, d], and let f be integrable
on R. Assume that for each x in [a, b] the function $f_x$ given by

$$f_x(y) = f(x, y)$$

is integrable on [c, d]. Then the function

$$x \mapsto \int_c^d f(x, y)\,dy$$

is integrable on [a, b] and

$$\int_R f = \int_a^b \left[\int_c^d f(x, y)\,dy\right] dx.$$

Geometrically speaking, the inner integral for a fixed value of x gives
the area of a cross section as indicated in the following figure. Then
integrating such areas yields the volume of the 3-dimensional figure
bounded below by the rectangle R, and above by the graph of f.


Figure 7 


The following situation will arise frequently in practice. 

Let $g_1$, $g_2$ be two smooth functions on a closed interval [a, b] ($a \le b$)
such that $g_1(x) \le g_2(x)$ for all x in that interval. Let c, d be numbers
such that

$$c < g_1(x) \le g_2(x) < d$$

for all x in the interval [a, b]. Then $g_1$, $g_2$ determine a region A lying
between x = a, x = b, and the two curves $y = g_1(x)$ and $y = g_2(x)$.
(Cf. Fig. 8.)



Figure 8 


Let f be a function which is continuous on the region A, and define
f on the rectangle [a, b] × [c, d] to be equal to 0 at any point of the
rectangle not lying in the region A. For any value x in the interval [a, b]
the integral

$$\int_c^d f(x, y)\,dy$$

can be written as a sum:

$$\int_c^{g_1(x)} f(x, y)\,dy + \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy + \int_{g_2(x)}^{d} f(x, y)\,dy.$$

Since f(x, y) = 0 whenever $c \le y < g_1(x)$ or $g_2(x) < y \le d$, it follows
that the two extreme integrals are equal to 0. Thus the repeated integral
of f over the rectangle is in fact equal to the repeated integral

$$\int_a^b \left[\int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\right] dx.$$

Regions of the type described by two functions $g_1$, $g_2$ as above are the
most common type of regions with which we deal.

From Theorem 5 and the preceding discussion, we obtain: 

Corollary. Let $g_1$, $g_2$ be two smooth functions defined on a closed
interval [a, b] ($a \le b$) such that $g_1(x) \le g_2(x)$ for all x in that interval.
Let f be a continuous function on the region A lying between x = a,
x = b, and the two curves $y = g_1(x)$ and $y = g_2(x)$. Then

$$\int_A f = \int_a^b \left[\int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\right] dx;$$

in other words, the double integral is equal to the repeated integral.

We shall give the proof of Theorem 5 below. Before doing that, we 
first give examples showing how to apply Theorem 5, or rather its 
corollary. 

Example 2. Let $f(x, y) = x^2 + y^2$. Find the integral of f over the
region A bounded by the straight line y = x and the parabola $y = x^2$
(Fig. 9).

In this case, we have $g_1(x) = x^2$ and $g_2(x) = x$. Thus our integral is
equal to

$$\int_0^1 \left[\int_{x^2}^{x} (x^2 + y^2)\,dy\right] dx.$$

Now the inner integral is given by

$$\int_{x^2}^{x} (x^2 + y^2)\,dy = \left[x^2 y + \frac{y^3}{3}\right]_{x^2}^{x} = x^3 + \frac{x^3}{3} - x^4 - \frac{x^6}{3}.$$

Hence the repeated integral is equal to

$$\int_0^1 \left(x^3 + \frac{x^3}{3} - x^4 - \frac{x^6}{3}\right) dx = \left[\frac{x^4}{4} + \frac{x^4}{12} - \frac{x^5}{5} - \frac{x^7}{21}\right]_0^1 = \frac{1}{4} + \frac{1}{12} - \frac{1}{5} - \frac{1}{21}.$$

(We don't need to simplify the number on the right.)
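The Corollary translates directly into a procedure with variable limits for the inner integral. The sketch below is ours (the names `simpson` and `integral_between_curves` are not from the text) and uses it to check the value found in Example 2.

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson rule for a function of one variable (n must be even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def integral_between_curves(f, a, b, g1, g2, n=200):
    """Repeated integral of f over the region a <= x <= b, g1(x) <= y <= g2(x).

    The limits of the inner integral depend on x, as in the Corollary.
    """
    inner = lambda x: simpson(lambda y: f(x, y), g1(x), g2(x), n)
    return simpson(inner, a, b, n)


if __name__ == "__main__":
    value = integral_between_curves(
        lambda x, y: x * x + y * y,   # f of Example 2
        0.0, 1.0,
        lambda x: x * x,              # lower curve y = x^2
        lambda x: x,                  # upper curve y = x
    )
    print(value, 1 / 4 + 1 / 12 - 1 / 5 - 1 / 21)
```

The two printed numbers agree to many decimal places. The outer integrand is a polynomial of degree 6, so Simpson's rule is only approximate here, but the error is negligible at this step size.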

Given a region A, it is frequently possible to break it up into smaller
regions having only boundary points in common, and such that each
smaller region is of the type we have just described. In that case, to
compute the integral of a function over A, we can apply Theorem 4.

Example 3. Let f(x, y) = 2xy. Find the integral of f over the triangle
bounded by the lines y = 0, y = x, and the line x + y = 2.

The region is as shown in Fig. 10.

We break up our region into the portion from 0 to 1 and the portion
from 1 to 2. These correspond to the small triangles $A_1$, $A_2$, as indicated
in the picture. Then

$$\int_{A_1} f = \int_0^1 \left[\int_0^{x} 2xy\,dy\right] dx$$

and

$$\int_{A_2} f = \int_1^2 \left[\int_0^{2-x} 2xy\,dy\right] dx.$$

There is no difficulty in evaluating these integrals, and we leave them to
you.

Finally, we define the area of a region A to be the integral of the function
1 over A, i.e.

$$\mathrm{Area}(A) = \iint_A 1\,dy\,dx.$$


Example 4. Find the area of the region bounded by the straight line
y = x and the curve $y = x^2$.

The region has been sketched in Example 2. By definition,

$$\mathrm{Area}(A) = \int_0^1\!\int_{x^2}^{x} dy\,dx = \int_0^1 (x - x^2)\,dx = \frac{1}{6}.$$

We also observe that the same arguments as before apply if we interchange
the role of x and y. Thus for the rectangle R we also have

$$\iint_R f(x, y)\,dy\,dx = \iint_R f(x, y)\,dx\,dy = \int_c^d \left[\int_a^b f(x, y)\,dx\right] dy.$$

The same goes for a region described by functions

$$x = g_1(y)$$

and

$$x = g_2(y)$$

with $g_1 \le g_2$ between y = c and y = d.
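The triangle of Example 3 (bounded by y = 0, y = x and x + y = 2, with f(x, y) = 2xy) is a convenient test of this remark. Described with x as a function of y, it is a single region, $y \le x \le 2 - y$ for $0 \le y \le 1$, whereas the description in terms of x needs the two pieces $A_1$, $A_2$. The sketch below compares the two. The helper names are ours, and Simpson's rule is exact here because all integrands are polynomials of degree at most 3.

```python
def simpson(g, lo, hi, n=20):
    """Composite Simpson rule for a function of one variable (n must be even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def dy_dx(f, a, b, g1, g2, n=20):
    """Region a <= x <= b, g1(x) <= y <= g2(x): inner integral with respect to y."""
    return simpson(lambda x: simpson(lambda y: f(x, y), g1(x), g2(x), n), a, b, n)


def dx_dy(f, c, d, g1, g2, n=20):
    """Region c <= y <= d, g1(y) <= x <= g2(y): inner integral with respect to x."""
    return simpson(lambda y: simpson(lambda x: f(x, y), g1(y), g2(y), n), c, d, n)


if __name__ == "__main__":
    f = lambda x, y: 2 * x * y
    part1 = dy_dx(f, 0.0, 1.0, lambda x: 0.0, lambda x: x)         # triangle A_1
    part2 = dy_dx(f, 1.0, 2.0, lambda x: 0.0, lambda x: 2.0 - x)   # triangle A_2
    whole = dx_dy(f, 0.0, 1.0, lambda y: y, lambda y: 2.0 - y)     # one piece
    print(part1, part2, part1 + part2, whole)
```

Both routes give the same total, 2/3, with the two pieces contributing 1/4 and 5/12.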

If A is a region in the plane bounded by a finite number of smooth
curves, and f is a function on A such that $f(X) \ge 0$ for $X \in A$, then we
can interpret f as a density function, and we also call the integral $\int_A f$
the mass of A.

Example 5. Find the integral of the function $f(x, y) = x^2 y^2$ over the
region bounded by the lines y = 1, y = 2, x = 0 and x = y (Fig. 11).

Figure 11

We have to compute the integral as prescribed, namely:

$$\int_1^2 \left[\int_0^{y} x^2 y^2\,dx\right] dy = \int_1^2 y^2 \left[\frac{x^3}{3}\right]_0^{y} dy = \int_1^2 \frac{y^5}{3}\,dy = \frac{y^6}{18}\,\Big|_1^2 = \frac{7}{2}.$$

We can also say that the preceding integral, namely 7/2, is the mass of A
corresponding to the density given by the function f. Of course the units of
mass are those determined by the units of density.


Example. Find the mass of a disc of radius a if the density is proportional
to the square of the distance from a point on the circumference.
We take the circle surrounding the disc to have equation

$$x^2 + y^2 = a^2,$$

and select the point on the circumference to be (a, 0), as shown on the
figure. The density is then given, for some constant k, by

$$f(x, y) = k[(x - a)^2 + y^2].$$

The mass is therefore given by the integral

$$2\int_{-a}^{a}\!\int_0^{\sqrt{a^2 - x^2}} k[(x - a)^2 + y^2]\,dy\,dx = \tfrac{3}{2}\,k\pi a^4.$$
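The value $\tfrac{3}{2}k\pi a^4$ for the mass of the disc can be checked numerically. This is a rough sketch under our own choices: the name `disc_mass` and the use of the midpoint rule for both one-variable integrals are not from the text. The midpoint rule converges slowly here because of the square root in the upper limit, so only a few digits should be expected.

```python
import math


def disc_mass(a, k, n=400):
    """Mass of the disc x^2 + y^2 <= a^2 with density k[(x - a)^2 + y^2].

    Computes twice the repeated integral over -a <= x <= a,
    0 <= y <= sqrt(a^2 - x^2); each one-variable integral is approximated
    by the midpoint rule with n subintervals.
    """
    dx = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * dx
        top = math.sqrt(a * a - x * x)     # upper limit of the inner integral
        dy = top / n
        inner = 0.0
        for j in range(n):
            y = (j + 0.5) * dy
            inner += k * ((x - a) ** 2 + y * y) * dy
        total += inner * dx
    return 2.0 * total


if __name__ == "__main__":
    print(disc_mass(1.0, 1.0), 1.5 * math.pi)
```

For a = 1 and k = 1 the two printed values agree to about three decimal places.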

We now give the proof of Theorem 5. We let R be the product of the
intervals I, J so R = I × J. We consider a partition $P = P_I \times P_J$ of
R, where $P_I$, $P_J$ are partitions of the intervals I, J respectively. Each
subrectangle of P can then be written as

$$S = S_I \times S_J,$$

where $S_I$ is a subinterval of I and $S_J$ a subinterval of J. Then

$$\mathrm{Area}(S) = l(S_I)\,l(S_J),$$

where l denotes the length of an interval.

We denote the function

$$x \mapsto \int_c^d f(x, y)\,dy$$

by $\int_J f$, so that the value of this function at x is

$$\int_J f_x = \int_c^d f(x, y)\,dy.$$

We have:

$$\begin{aligned}
L(P_I \times P_J, f) &= \sum_S (\operatorname{glb}_S f)\,\mathrm{Area}(S) \\
&= \sum_{S_I} \sum_{S_J} (\operatorname{glb}_{S_I \times S_J} f)\,\mathrm{Area}(S_I \times S_J) \\
&= \sum_{S_I} \Big[ \sum_{S_J} \Big( \operatorname{glb}_{(x, y) \in S_I \times S_J} f(x, y) \Big)\, l(S_J) \Big]\, l(S_I). \qquad (*)
\end{aligned}$$

For any x in $S_I$ we have

$$\sum_{S_J} \Big( \operatorname{glb}_{(x, y) \in S_I \times S_J} f(x, y) \Big)\, l(S_J) \le \sum_{S_J} \Big( \operatorname{glb}_{y \in S_J} f(x, y) \Big)\, l(S_J) = L(P_J, f_x) \le \int_J f_x$$

because each term in the expression on the left involves a glb over all
(x, y) rather than only over y, and thus contributes less to the sum. Thus
the expression on the left is a lower bound for the expression on the right.
From this we conclude that the expression on the left is
$\le \operatorname{glb}_{x \in S_I} \int_J f_x$, and hence by (*)

$$L(P, f) = L(P_I \times P_J, f) \le \sum_{S_I} \Big( \operatorname{glb}_{x \in S_I} \int_J f_x \Big)\, l(S_I) = L\Big(P_I, \int_J f\Big) \le U\Big(P_I, \int_J f\Big).$$

By similar arguments applied to upper sums instead of lower sums, we
conclude that

$$L(P, f) \le L\Big(P_I, \int_J f\Big) \le U\Big(P_I, \int_J f\Big) \le U(P_I \times P_J, f) = U(P, f).$$

Since f is assumed to be integrable, it follows that for suitable partitions,
L(P, f) and U(P, f) are arbitrarily close together. Thus the lower sums
for the function $\int_J f$ and the upper sums for this function are arbitrarily
close for suitable partitions $P_I$. This implies that the function $\int_J f$ is
integrable, and the fact that these lower sums and upper sums are squeezed
between L(P, f) and U(P, f) shows that the double integral is equal to
the repeated integral, as desired.
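The chain of inequalities in the proof can be watched on a small example. In this sketch, which is ours and not part of the text, we take f(x, y) = xy on the unit square, so that the function $\int_J f$ is x ↦ x/2. Both functions are increasing, so greatest lower bounds and least upper bounds sit at the ends of each subinterval. The helper names are hypothetical.

```python
def sums_2d(f, n):
    """Lower and upper sums of f over [0, 1] x [0, 1] for the n-by-n partition,
    assuming f is increasing in each variable (glb and lub at opposite corners)."""
    h = 1.0 / n
    lower = sum(f(i * h, j * h) for i in range(n) for j in range(n)) * h * h
    upper = sum(f((i + 1) * h, (j + 1) * h) for i in range(n) for j in range(n)) * h * h
    return lower, upper


def sums_1d(F, n):
    """Lower and upper sums of an increasing function F over [0, 1]."""
    h = 1.0 / n
    lower = sum(F(i * h) for i in range(n)) * h
    upper = sum(F((i + 1) * h) for i in range(n)) * h
    return lower, upper


if __name__ == "__main__":
    f = lambda x, y: x * y
    F = lambda x: x / 2.0          # the inner integral of f over y in [0, 1]
    for n in (5, 10, 50):
        l2, u2 = sums_2d(f, n)
        l1, u1 = sums_1d(F, n)
        print(n, l2, l1, u1, u2)   # L(P, f) <= L(P_I, F) <= U(P_I, F) <= U(P, f)
```

For n = 10 the four numbers are 0.2025, 0.225, 0.275 and 0.3025, and all of them approach 1/4 as n grows.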


Exercises 


1. Find the value of the following repeated integrals. 


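(a) $\int_0^{\ldots}\!\int_1^{\ldots} (x + y)\,dx\,dy$   (b) $\int_0^{\ldots}\!\int_1^{\ldots} y\,dy\,dx$   (c) $\int_0^{\ldots}\!\int_{y^2}^{\ldots} \sqrt{x}\,dx\,dy$

(d) $\int_0^{\pi}\!\int_0^{x} x \sin y\,dy\,dx$   (e) $\int_1^{2}\!\int_y^{y^2} dx\,dy$   (f) $\int_0^{\pi}\!\int_0^{\sin x} y\,dy\,dx$

(g) $\int_0^{\pi/2}\!\int_0^{2} r^2 \cos\theta\,dr\,d\theta$

(h) $\int_0^{2\pi}\!\int_0^{1 - \cos\theta} r^3 \cos^2\theta\,dr\,d\theta$

(i) $\int_0^{\arctan 3/2}\!\int_0^{2\sec\theta} r\,dr\,d\theta$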


2. Sketch the regions described by the following inequalities. 

(a) $|x| \le 1$, $-1$ …   (b) $|x| \le 3$, $|y| \le 4$

(c) $x + y \le 1$, $x \ge 0$, $y \ge 0$   (d) $0 \le y \le |x|$, $0 \le x \le 5$

(e) $0 \le x \le y$, $0 \le y \le 5$   (f) $|x| + |y| \le 1$

3. Find the integral of the following functions. 

(a) $x \cos(x + y)$ over the triangle whose vertices are (0, 0), $(\pi, 0)$, and
$(\pi, \pi)$.

(b) $e^{x+y}$ over the region defined by $|x| + |y| \le 1$.

(c) $x^2 - y^2$ over the region bounded by the curve y = sin x between 0
and $\pi$.

(d) $x^2 + y$ over the triangle whose vertices are (…), (1, 2), (1, −1).

4. Find the integrals of the following functions over the indicated region. 

(a) f(x, y) = x over the region bounded by $y = x^2$ and $y = x^3$.

(b) f(x, y) = y over the same region as in (a).

(c) $f(x, y) = x^2$ over the region bounded by y = x, y = 2x, and x = 2.


5. Let a be a number > 0. Show that the area of the region consisting of all
points (x, y) such that $|x| + |y| \le a$, is $(2a)^2/2!$.

6. Find the following integrals and sketch the region of integration in each case. 

(a) $\int_{\ldots}^{2}\!\int_{\ldots}^{x^3} x\,dy\,dx$   (b) $\int_{\ldots}^{1}\!\int_{\ldots}^{2x} e^{x+y}\,dy\,dx$

(c) $\int_0^{2}\!\int_1^{3} \ldots \sin y\,dx\,dy$   (d) $\int_{\ldots}^{\pi/2}\!\int_{\ldots}^{y} \sin x\,dx\,dy$

(e) $\int_{-1}^{1}\!\int_0^{|x|} dy\,dx$   (f) $\int_0^{\pi/2}\!\int_0^{\cos y} x \sin y\,dx\,dy$

7. Sketch the region defined by $x \ge 0$, $x^2 + y^2 \le 2$, and $x^2 + y^2 \ge 1$.
Determine the integral of $f(x, y) = x^2$ over this region. If you wait till you
study polar coordinates in the next section, you will do this exercise more
easily.

8. Integrate the function / over the indicated region. 

(a) f(x, y) = 1/(x + y) over the region bounded by the lines y = x,
x = 1, x = 2, y = 0.

(b) $f(x, y) = x^2 - y^2$ over the region defined by the inequalities
$0 \le x \le 1$ and $x^2 - y^2 \ge 0$.

(c) f(x, y) = x sin xy over the rectangle $0 \le x \le \pi$ and $0 \le y \le 1$.

(d) $f(x, y) = x^2 - y^2$ over the triangle whose vertices are (−1, 1), (0, 0),
(1, 1).

(e) f(x, y) = 1/(x + y + 1) over the square $0 \le x \le 1$, $0 \le y \le 1$.

9. Compute the integral of the function f(x,y ) = xy over the region defined 
by the inequalities 0 x 2 + y 2 ^ 1, 0 ^ x, j ^ y, sketched below. 



§3. Polar coordinates

It is frequently more convenient to describe a region by means of polar 
coordinates than with the “rectangular” coordinates of the preceding 
section. Such a region can then be described as the image of a simpler 
region as follows. 

Let a, b be numbers with $0 \le a \le b \le 2\pi$. Let c, d be two numbers
with $0 \le c \le d$. Then the inequalities

$$a \le \theta \le b \quad\text{and}\quad c \le r \le d$$

describe a rectangle in the $(r, \theta)$-plane. Under the map

$$G: \mathbf{R}^2 \to \mathbf{R}^2$$

given by $G(r, \theta) = (r\cos\theta, r\sin\theta)$, that is

$$x = r\cos\theta, \quad y = r\sin\theta,$$

this rectangle goes into a circular region as shown in Fig. 12.



The preceding map G is called the polar coordinate map. 

Consider partitions

$$a = \theta_1 \le \theta_2 \le \cdots \le \theta_n = b, \qquad c = r_1 \le r_2 \le \cdots \le r_m = d$$

of the two intervals [a, b] and [c, d]. Each pair of intervals $[\theta_i, \theta_{i+1}]$ and
$[r_j, r_{j+1}]$ determines a small region as shown in the following figure.

The area of such a region is equal to the difference between the area of
the sector having angle $\theta_{i+1} - \theta_i$ and radius $r_{j+1}$, and the area of the
sector having the same angle but radius $r_j$. The area of a sector having
angle $\theta$ and radius r is equal to

$$\frac{\theta r^2}{2}.$$

Consequently the difference mentioned above is equal to

$$\frac{(\theta_{i+1} - \theta_i)\,r_{j+1}^2}{2} - \frac{(\theta_{i+1} - \theta_i)\,r_j^2}{2} = (\theta_{i+1} - \theta_i)\,\frac{(r_{j+1} + r_j)}{2}\,(r_{j+1} - r_j).$$

We note that

$$\frac{r_{j+1} + r_j}{2} = \bar r_j$$

is a number between $r_j$ and $r_{j+1}$, and the area of the small region is
therefore equal to

$$\bar r_j\,(r_{j+1} - r_j)(\theta_{i+1} - \theta_i).$$
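The identity between the difference of two sectors and the product with the mean radius is easy to confirm on numbers. A minimal sketch, with function names of our own choosing:

```python
def sector_area(theta, r):
    """Area of a circular sector of angle theta (in radians) and radius r."""
    return theta * r * r / 2.0


def polar_cell_area(r_lo, r_hi, th_lo, th_hi):
    """Area of the region r_lo <= r <= r_hi, th_lo <= theta <= th_hi,
    computed as a difference of two sectors."""
    dth = th_hi - th_lo
    return sector_area(dth, r_hi) - sector_area(dth, r_lo)


def polar_cell_area_mean(r_lo, r_hi, th_lo, th_hi):
    """The same area written as r_bar * (r_hi - r_lo) * (th_hi - th_lo),
    where r_bar is the mean of the two radii."""
    r_bar = (r_lo + r_hi) / 2.0
    return r_bar * (r_hi - r_lo) * (th_hi - th_lo)


if __name__ == "__main__":
    print(polar_cell_area(1.0, 1.5, 0.2, 0.7))        # 0.3125
    print(polar_cell_area_mean(1.0, 1.5, 0.2, 0.7))   # 0.3125
```

The second form is the one that exhibits the factor r which later appears inside the integral.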

If f is a function on the (x, y)-plane, it determines a function of $(r, \theta)$
by the formula

$$f^*(r, \theta) = f(r\cos\theta, r\sin\theta).$$

Then

$$\sum_{j=1}^{m-1} \sum_{i=1}^{n-1} f^*(\bar r_j, \theta_i)\,\bar r_j\,(r_{j+1} - r_j)(\theta_{i+1} - \theta_i)$$

is a Riemann sum for the function $f^*(r, \theta)\,r$ on the product [a, b] × [c, d].
Consequently the following theorem is now very plausible.

Theorem 6. Let R = [a, b] × [c, d] be as above, and let G be the polar
coordinate map. Let f be bounded and continuous on G(R), except possibly
at a finite number of smooth curves. Let $f^*$ be the corresponding
function of $(r, \theta)$. Then

$$\iint_R f^*(r, \theta)\,r\,dr\,d\theta = \iint_{G(R)} f(x, y)\,dy\,dx.$$

In the next chapter, we shall state another theorem which gives another 
justification for this change of variables formula. We do not prove any 
of these statements in this course, since the rigorous proofs depend on 
more developed techniques. 

As with rectangular coordinates, we can deal with more general regions.
Let $g_1$, $g_2$ be two smooth functions defined on the interval [a, b] and
assume

$$0 \le g_1(\theta) \le g_2(\theta)$$

for all $\theta$ in that interval. Let A be the region consisting of all points $(r, \theta)$
such that $a \le \theta \le b$ and $g_1(\theta) \le r \le g_2(\theta)$. We can select two
numbers $c, d \ge 0$ such that

$$c \le g_1(\theta) \le g_2(\theta) \le d$$

for all $\theta$ in the interval [a, b]. Let f be continuous on G(A) and extend f to
the circular sector of radius d between $\theta = a$ and $\theta = b$ by giving it the
value 0 outside G(A). Then the integral of Theorem 6 taken over this sector
is equal to the repeated integral

$$\int_a^b\!\int_{g_1(\theta)}^{g_2(\theta)} f^*(r, \theta)\,r\,dr\,d\theta.$$

The following picture shows a typical region G(A) under consideration.
The important thing to remember about the formula of Theorem 6 is
the appearance of an extra r inside the integral.



We also remark that a region could be described by taking $\theta$ as a function
of r, and letting r vary between two constant values. In view of
Theorem 5, we can evaluate the double integral of Theorem 6 by repeated
integration first with respect to $\theta$ and then with respect to r.

In dealing with polar coordinates, it is useful to remember the equation
of a circle. Let a > 0. Then

$$r = a\cos\theta, \qquad -\pi/2 \le \theta \le \pi/2,$$

is the equation of a circle of radius a/2 and center (a/2, 0). Similarly,

$$r = a\sin\theta, \qquad 0 \le \theta \le \pi,$$

is the equation of a circle of radius a/2 and center (0, a/2). You can
easily show this, as an exercise, using the relations

$$r = \sqrt{x^2 + y^2}, \quad x = r\cos\theta, \quad y = r\sin\theta.$$

(Note. The coordinates of the center above are given in rectangular
coordinates.)
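One possible way to carry out this verification for the first circle is sketched here; the second is analogous. Multiplying $r = a\cos\theta$ by r and substituting the relations between the two coordinate systems gives:

```latex
\[
r^2 = a\,r\cos\theta
\;\Longrightarrow\;
x^2 + y^2 = a x
\;\Longrightarrow\;
\left(x - \frac{a}{2}\right)^2 + y^2 = \left(\frac{a}{2}\right)^2 ,
\]
```

which is the equation of the circle of radius a/2 centered at (a/2, 0). Multiplying by r adds only the origin, which already lies on the curve (it corresponds to $\theta = \pm\pi/2$).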

Example. Find the integral of the function $f(x, y) = x^2$ over the
region enclosed by the curve given in polar coordinates by the equation

$$r = 1 - \cos\theta.$$

The function of the polar coordinates $(r, \theta)$ corresponding to f is given
by

$$f^*(r, \theta) = r^2\cos^2\theta.$$

The region in the polar coordinate space is described by the inequalities

$$0 \le r \le 1 - \cos\theta \quad\text{and}\quad 0 \le \theta \le 2\pi.$$

This region in the (x, y)-plane looks like this:

The desired integral is therefore the integral

$$\int_0^{2\pi}\!\int_0^{1 - \cos\theta} r^3\cos^2\theta\,dr\,d\theta.$$

We integrate first with respect to r, which is easy, and see that our integral
is equal to

$$\int_0^{2\pi} \frac{1}{4}\,(1 - \cos\theta)^4\cos^2\theta\,d\theta.$$


The evaluation of this integral is done by techniques of the first course in 





calculus. We expand out the expression of the fourth power, and get a
sum of terms involving $\cos^k\theta$ for k = 0, . . . , 6. The reader should know
how to integrate powers of the cosine, using repeatedly the formula

$$\cos^2\theta = \frac{1 + \cos 2\theta}{2},$$

or using the recursion formula in terms of lower powers. No matter
what method the reader uses, he will find the final answer to be

$$\frac{49\pi}{32}.$$
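The value $49\pi/32$ can be confirmed numerically from the polar form of the integral, including the extra factor r. The sketch below is ours: the names `simpson` and `polar_integral` are hypothetical helpers, and Simpson's rule is one convenient choice of one-variable method.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule for a function of one variable (n must be even)."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def polar_integral(f, a, b, g1, g2, n=200):
    """Integrate f(x, y) over the image of a <= theta <= b,
    g1(theta) <= r <= g2(theta), using f*(r, theta) r dr dtheta."""
    def inner(theta):
        integrand = lambda r: f(r * math.cos(theta), r * math.sin(theta)) * r
        return simpson(integrand, g1(theta), g2(theta), n)
    return simpson(inner, a, b, n)


if __name__ == "__main__":
    value = polar_integral(
        lambda x, y: x * x,               # f(x, y) = x^2
        0.0, 2.0 * math.pi,
        lambda t: 0.0,                    # inner radius
        lambda t: 1.0 - math.cos(t),      # the curve r = 1 - cos(theta)
    )
    print(value, 49.0 * math.pi / 32.0)
```

Dropping the factor `* r` in `integrand` gives a different number, which is a quick way to see that the extra r is not optional.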


Exercises 


1. By changing to polar coordinates, find the integral of $e^{x^2 + y^2}$ over the region
consisting of the points (x, y) such that $x^2 + y^2 \le 1$.

2. Find the volume of the region lying over the disc $x^2 + (y - 1)^2 \le 1$ and
bounded from above by the function $z = x^2 + y^2$.

3. Find the integral of $e^{-(x^2 + y^2)}$ over the circular disc bounded by

$$x^2 + y^2 = a^2, \quad a > 0.$$


4. What is

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2 + y^2)}\,dx\,dy\,?$$

5. Find the mass of a square plate of side a if the density is proportional to 
the square of the distance from a vertex. 

6. Find the mass of a circular disk of radius a if the density is proportional
to the square of the distance from a point on the circumference.

7. Find the mass of a plate bounded by one arch of the curve y = sin x, and 
the x-axis, if the density is proportional to the distance from the x-axis. 

Evaluate the following integrals. Take a > 0. 


8 . 



9. $\int_0^{\ldots}\!\int_0^{\ldots} (x^2 + y^2)\,dx\,dy$

10. $\int_{\ldots}^{a/\sqrt{2}}\!\int_{\ldots}^{\sqrt{a^2 - y^2}} x\,dx\,dy$


11. Find the area inside the curve $r = a(1 + \cos\theta)$ and outside the circle r = a.

12. The base of a solid is the region of Exercise 11 and the top is given by the
function f(x, y) = x. Find the volume.

13. Find the area enclosed by the curve $r^2 = 2a^2\cos 2\theta$.

14. The base of a solid is the area of Exercise 13, and the top is bounded by
the function (in terms of polar coordinates) $f(r, \theta) = \sqrt{2a^2 - r^2}$. Find
the volume.

15. Find the integral of the function

$$f(x, y) = \frac{1}{(x^2 + y^2 + 1)^{3/2}}$$

over the disc of radius a centered at the origin. Letting a tend to infinity,
show that

$$\int_{-\infty}^{\infty} \left[\int_{-\infty}^{\infty} f(x, y)\,dy\right] dx = 2\pi.$$


16. Answer the same question for the function

$$f(x, y) = \frac{1}{(x^2 + y^2 + 2)^2}.$$


17. Find the integral of the function

$$f(x, y) = \frac{1}{(x^2 + y^2)^3}$$

over the region between the two circles of radius 2 and radius 3, centered
at the origin.

18. (a) Find the integral of the function f(x, y) = x over the region bounded
in polar coordinates by $r = 1 - \cos\theta$.

(b) Let a be a number > 0. Find the integral of the function $f(x, y) = x^2$
over the region bounded in polar coordinates by $r = a(1 - \cos\theta)$.

19. Sketch the region defined by $x \ge 0$, $x^2 + y^2 \le 2$ and $x^2 + y^2 \ge 1$.
Determine the integral of the following functions over this region.

(a) $f(x, y) = x^2$   (b) f(x, y) = x   (c) f(x, y) = y.

20. Let n be a positive integer, and let $f(x, y) = 1/r^n$, where $r = \sqrt{x^2 + y^2}$.

(a) Find the integral of this function over the region contained between two
circles of radii a and b respectively, with 0 < a < b.

(b) For which values of n does this integral approach a limit as $a \to 0$?


§4. Triple integrals 

The entire discussion concerning 2-dimensional integrals generalizes to 
higher dimensions. We discuss briefly the 3-dimensional case. 

A 3-dimensional rectangle (rectangular parallelepiped) can be written
as a product of three intervals:

$$R = [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3].$$

It looks like this.

A partition P of R is then determined by partitions $P_1$, $P_2$, $P_3$ of the three
intervals respectively, and partitions R into 3-dimensional subrectangles,
which we denote again by S.

If f is a bounded function on R, we may then form upper and lower
sums. Indeed, we define the volume of the rectangle R above to be the
3-dimensional volume

$$\mathrm{Vol}(R) = (b_3 - a_3)(b_2 - a_2)(b_1 - a_1)$$

and similarly for the subrectangles of the partition. Then we have

$$L(P, f) = \sum_S (\operatorname{glb}_S f)\,\mathrm{Vol}(S),$$

$$U(P, f) = \sum_S (\operatorname{lub}_S f)\,\mathrm{Vol}(S).$$

A refinement P' of P is determined by refinements $P_1'$, $P_2'$, $P_3'$ of $P_1$, $P_2$, $P_3$
respectively, and the Lemma of §1 extends to this case.

Again, we say that f is integrable if the least upper bound of the lower
sums is equal to the greatest lower bound of the upper sums, and if this
is the case, we define it to be the integral of f over R, written

$$\int_R f = \iiint_R f(x, y, z)\,dz\,dy\,dx$$

if the variables are x, y, z.

If $f \ge 0$, then we interpret this integral as the 4-dimensional volume of
the 4-dimensional region lying in 4-space, bounded from below by R,
and from above by the graph of f. Of course, we cannot draw this figure
because it is in 4-space, but the terminology goes right over.

Theorem 1 and Theorem 2 are again valid, that is the integral is linear, 
and satisfies the usual inequality. 

The criterion of Theorem 3 for a function to be integrable also has an
analogue. In this case, however, we have to parametrize the boundary
of a 3-dimensional region by 2-dimensional smooth pieces of surfaces.
Thus let T be a 2-dimensional rectangle, and let

$$F: T \to \mathbf{R}^3$$

be a map. If F is of class $C^1$ we shall say that F is smooth, and we call
the image of F a smooth surface (Fig. 17).



Figure 17 

The analogue of Theorem 3 is then: 

Let R be a 3-dimensional rectangle, and let f be a function defined on R,
bounded and continuous except possibly at the points lying on a finite number
of smooth surfaces. Then f is integrable on R.

Again we can integrate over a more general region than a rectangle,
provided such a region A has a boundary which is contained in a finite
number of smooth surfaces. Then Theorem 4 holds. If A denotes a
3-dimensional region and f is a function on A, we denote the integral of
f over A by

$$\iiint_A f(x, y, z)\,dz\,dy\,dx.$$

If we view A as a solid piece of material, and f is interpreted as a density
distribution over A, then the integral of f over A may be interpreted as
the mass of A.

To compute multiple integrals in the 3-dimensional case, we have the
same situation as in the 2-dimensional case.

The theorem concerning the relation with repeated integrals holds, so
that if R is the rectangle given by

$$R = [a_1, b_1] \times [a_2, b_2] \times [a_3, b_3],$$

then

$$\int_R f = \int_{a_1}^{b_1} \left[\int_{a_2}^{b_2} \left[\int_{a_3}^{b_3} f(x, y, z)\,dz\right] dy\right] dx.$$

Of course, the repeated integral can be evaluated in any order.


Example 1. Find the integral of the function f(x, y, z) = sin x over the
rectangle $0 \le x \le \pi$, $2 \le y \le 3$, and $-1 \le z \le 1$.

The integral is equal to

$$\int_0^{\pi}\!\int_2^{3}\!\int_{-1}^{1} \sin x\,dz\,dy\,dx.$$

If we first integrate with respect to z, we get $z\,\big|_{-1}^{1} = 2$. Next with respect
to y, we get $y\,\big|_2^3 = 1$. We are then reduced to the integral

$$\int_0^{\pi} 2\sin x\,dx = -2\cos x\,\Big|_0^{\pi} = -2(\cos\pi - \cos 0) = 4.$$

We also have the integral over regions determined by inequalities. 

Case 1. Rectangular coordinates. Let a, b be numbers, $a \le b$. Let $g_1$,
$g_2$ be two smooth functions defined on the interval [a, b] such that

$$g_1(x) \le g_2(x),$$

and let $h_1(x, y) \le h_2(x, y)$ be two smooth functions defined on the region
consisting of all points (x, y) such that

$$a \le x \le b \quad\text{and}\quad g_1(x) \le y \le g_2(x).$$

Let A be the set of points (x, y, z) such that

$$a \le x \le b, \quad g_1(x) \le y \le g_2(x), \quad\text{and}\quad h_1(x, y) \le z \le h_2(x, y).$$

Let f be continuous on A. Then

$$\int_A f = \int_a^b \left[\int_{g_1(x)}^{g_2(x)} \left(\int_{h_1(x, y)}^{h_2(x, y)} f(x, y, z)\,dz\right) dy\right] dx.$$

For simplicity, the integral on the right will also be written without the
brackets.

Example. Consider the tetrahedron T spanned by O and the three unit
vectors (Fig. 18).

This tetrahedron is described by the inequalities:

\[ 0 \leq x \leq 1, \quad 0 \leq y \leq 1 - x, \quad 0 \leq z \leq 1 - x - y. \]

Hence if f is a function on the tetrahedron, its integral over T is given by

\[ \iiint_T f = \int_0^1 \int_0^{1-x} \int_0^{1-x-y} f(x, y, z) \, dz \, dy \, dx. \]

For the constant function 1, the integral gives you the volume of the
tetrahedron, and you should have no difficulty in evaluating it, finding the
value 1/6.
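For the reader who wishes to compare, the evaluation for the constant function 1 can be carried out as follows (a routine computation, written out here only as a check):

```latex
\begin{aligned}
\int_0^1 \int_0^{1-x} \int_0^{1-x-y} dz \, dy \, dx
  &= \int_0^1 \int_0^{1-x} (1 - x - y) \, dy \, dx \\
  &= \int_0^1 \frac{(1-x)^2}{2} \, dx
   = \frac{1}{6}.
\end{aligned}
```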

Case 2. Cylindrical coordinates. Analogously to the polar coordinate
map in 2-space, we consider the cylindrical coordinate map in 3-space,
given by

\[ G(r, \theta, z) = (r \cos \theta, \; r \sin \theta, \; z). \]

In other words,

\[ x = r \cos \theta, \qquad y = r \sin \theta, \qquad z = z. \]

The image of a box B defined by the inequalities:

\[ 0 \leq \theta_1 \leq \theta \leq \theta_2 \leq 2\pi, \quad 0 \leq r_1 \leq r \leq r_2, \quad \text{and} \quad z_1 \leq z \leq z_2, \]

under the map G is shown in the following picture.



The volume of the elementary region on the right, which is the image of
the box under the cylindrical coordinate map, is equal to the area of the
base times the altitude, and is therefore equal to

\[ \frac{r_2^2 - r_1^2}{2} \, (\theta_2 - \theta_1)(z_2 - z_1). \]

This expression can be rewritten in the form

\[ \bar{r} \, (z_2 - z_1)(r_2 - r_1)(\theta_2 - \theta_1), \]

where

\[ \bar{r} = \frac{r_1 + r_2}{2}. \]

Therefore if a function f is given in terms of the rectangular coordinates
over some region, which is the image G(A) of some region in the $(r, \theta, z)$-space,
then its integral is given in terms of the cylindrical coordinates by

\[ \iiint_{G(A)} f(x, y, z) \, dz \, dy \, dx = \iiint_A f(r \cos \theta, r \sin \theta, z) \, r \, dz \, dr \, d\theta. \]


Indeed, the same kind of argument applies as with polar coordinates.

A region may be described by inequalities given by functions. For
instance, let A be the region in the $(\theta, r, z)$-space consisting of all points
$(\theta, r, z)$ satisfying conditions:

\[ a \leq \theta \leq b \qquad (b \leq a + 2\pi), \]

\[ 0 \leq g_1(\theta) \leq r \leq g_2(\theta), \]

with smooth functions $g_1$, $g_2$ defined on the interval [a, b], and

\[ h_1(\theta, r) \leq z \leq h_2(\theta, r), \]

with smooth functions $h_1$, $h_2$ defined on the 2-dimensional region bounded
by $\theta = a$, $\theta = b$, and $g_1$, $g_2$, i.e. the region consisting of all points $(\theta, r)$
satisfying the above inequalities. Let G be the map of cylindrical coordinates
given above. Let f be a continuous function on the region G(A) in the
(x, y, z)-space. Let

\[ f^*(\theta, r, z) = f(r \cos \theta, r \sin \theta, z). \]

Then

\[ \iiint_{G(A)} f(x, y, z) \, dz \, dy \, dx = \int_a^b \int_{g_1(\theta)}^{g_2(\theta)} \int_{h_1(\theta, r)}^{h_2(\theta, r)} f^*(\theta, r, z) \, r \, dz \, dr \, d\theta. \]

The function which we denote by $f^*$ may be viewed as the function f in
terms of the cylindrical coordinates.
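The repeated integral in cylindrical coordinates, with limits given by functions, can be approximated by nested midpoint sums. The sketch below is only illustrative: the function name `cylindrical_iterated` is ours, and the test region (the solid cone $r \leq z \leq 1$, whose volume is $\pi/3$) is a hypothetical example chosen because its volume is known.

```python
import math

def cylindrical_iterated(f_star, a, b, g1, g2, h1, h2, n=60):
    """Midpoint-rule sketch of the repeated integral
    int_a^b int_{g1(theta)}^{g2(theta)} int_{h1(theta,r)}^{h2(theta,r)}
        f*(theta, r, z) * r  dz dr dtheta."""
    h_t = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * h_t
        r_lo, r_hi = g1(theta), g2(theta)
        h_r = (r_hi - r_lo) / n
        for j in range(n):
            r = r_lo + (j + 0.5) * h_r
            z_lo, z_hi = h1(theta, r), h2(theta, r)
            h_z = (z_hi - z_lo) / n
            for k in range(n):
                z = z_lo + (k + 0.5) * h_z
                # The factor r comes from the cylindrical coordinate map.
                total += f_star(theta, r, z) * r * h_z * h_r * h_t
    return total

# Hypothetical test region: 0 <= theta <= 2 pi, 0 <= r <= 1, r <= z <= 1.
cone_volume = cylindrical_iterated(
    lambda theta, r, z: 1.0,
    0.0, 2 * math.pi,
    lambda theta: 0.0, lambda theta: 1.0,
    lambda theta, r: r, lambda theta, r: 1.0,
)
print(cone_volume, math.pi / 3)
```

The step sizes in r and z are recomputed for each outer sample point, which is what distinguishes non-constant limits from the case of a box.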

Example. Find the mass of a solid bounded by the polar coordinates
$\pi/3 \leq \theta \leq 2\pi/3$ and $r = \cos \theta$ and by z = 0, z = r, if the density is
given by the function

\[ f^*(\theta, r, z) = 3r. \]

The mass is given by the integral

\[ \int_{\pi/3}^{2\pi/3} \int_0^{\cos \theta} \int_0^{r} 3r \cdot r \, dz \, dr \, d\theta. \]

Integrating the inner integral with respect to z yields $3r^2 \cdot r = 3r^3$. Integrating
with respect to r between 0 and $\cos \theta$ yields

\[ \frac{3r^4}{4} \Big|_0^{\cos \theta} = \frac{3 \cos^4 \theta}{4}. \]

Finally we integrate with respect to $\theta$, using elementary techniques of
integration: $\cos^2 \theta = (1 + \cos 2\theta)/2$ so that

\[ \cos^4 \theta = \tfrac{1}{4} (1 + 2 \cos 2\theta + \cos^2 2\theta)
   = \tfrac{1}{4} \left( 1 + 2 \cos 2\theta + \frac{1 + \cos 4\theta}{2} \right). \]

We can now integrate this between the given limits, and we find

\[ \frac{3}{4} \int_{\pi/3}^{2\pi/3} \cos^4 \theta \, d\theta
   = \frac{3}{16} \left( \frac{\pi}{3} - \sqrt{3} + \frac{\pi}{6} + \frac{\sqrt{3}}{8} \right). \]

Note. In the above example, the function is already given in terms of
$(r, \theta, z)$. It corresponds to the function $f(x, y, z) = 3\sqrt{x^2 + y^2}$. Indeed,
taking $f(r \cos \theta, r \sin \theta, z)$ yields 3r.

Example. Let us find the volume of the region inside the cylinder
$r = 4 \cos \theta$, bounded above by the sphere $r^2 + z^2 = 16$, and below by
the plane z = 0. In the (x, y)-plane, the equation $r = 4 \cos \theta$ is that of a
circle, with $0 \leq \theta \leq \pi$. The region is then defined by means of the other
two inequalities

\[ 0 \leq z \leq \sqrt{16 - r^2} \quad \text{and} \quad 0 \leq r \leq 4 \cos \theta. \]

Therefore the desired volume V is the integral

\[ \begin{aligned}
V &= \int_0^{\pi} \int_0^{4 \cos \theta} \int_0^{\sqrt{16 - r^2}} r \, dz \, dr \, d\theta \\
  &= \int_0^{\pi} \int_0^{4 \cos \theta} r \sqrt{16 - r^2} \, dr \, d\theta \\
  &= -\frac{64}{3} \int_0^{\pi} (\sin^3 \theta - 1) \, d\theta = \frac{64}{9} (3\pi - 4).
\end{aligned} \]

Case 3. Spherical coordinates. We consider the region in coordinates
$(\rho, \theta, \varphi)$ described by

\[ 0 \leq \rho, \quad 0 \leq \varphi \leq \pi, \quad 0 \leq \theta \leq 2\pi. \]

These coordinates can be used to describe a point in 3-space as shown on
the following picture.

In fact, we let

\[ \rho = \sqrt{x^2 + y^2 + z^2}. \]

We denote this by $\rho$ to distinguish it from the polar coordinate r in the
(x, y)-plane. We see that

\[ x^2 + y^2 = \rho^2 - z^2 = \rho^2 \sin^2 \varphi \]

so that the polar r is given by

\[ r = \sqrt{x^2 + y^2} = \rho \sin \varphi. \]

In taking the square root, we do not need to use the absolute value $|\sin \varphi|$
because we take $0 \leq \varphi \leq \pi$ so that $\sin \varphi \geq 0$ for our values of $\varphi$.

From the formulas $x = r \cos \theta$ and $y = r \sin \theta$, we then obtain the
relationship between (x, y, z) and $(\rho, \theta, \varphi)$, namely:

\[ \begin{aligned}
x &= \rho \sin \varphi \cos \theta, \\
y &= \rho \sin \varphi \sin \theta, \\
z &= \rho \cos \varphi.
\end{aligned} \]

We can also say that we have a mapping $G: \mathbf{R}^3 \to \mathbf{R}^3$ given by

\[ G(\rho, \theta, \varphi) = (\rho \sin \varphi \cos \theta, \; \rho \sin \varphi \sin \theta, \; \rho \cos \varphi). \]
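The spherical coordinate map can be written out directly as a small program. This is a sketch only: the names `spherical_to_rect` and `rect_to_spherical` are ours, and the inverse is assumed to be applied to points other than the origin, with the angles normalized to $0 \leq \theta < 2\pi$ and $0 \leq \varphi \leq \pi$.

```python
import math

def spherical_to_rect(rho, theta, phi):
    """The map G(rho, theta, phi) = (x, y, z) of spherical coordinates."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z

def rect_to_spherical(x, y, z):
    """An inverse of G away from the origin (our own convention for the angles)."""
    rho = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / rho)                    # 0 <= phi <= pi
    theta = math.atan2(y, x) % (2 * math.pi)    # 0 <= theta < 2 pi
    return rho, theta, phi

x, y, z = spherical_to_rect(2.0, 1.0, 0.5)
# The polar r in the (x, y)-plane equals rho * sin(phi).
print(math.sqrt(x * x + y * y), 2.0 * math.sin(0.5))
print(rect_to_spherical(x, y, z))  # recovers (2.0, 1.0, 0.5) up to rounding
```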

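Let R be the 3-dimensional rectangle in the $(\rho, \theta, \varphi)$-space described by
the inequalities:

\[ \rho_1 \leq \rho \leq \rho_2, \qquad \theta_1 \leq \theta \leq \theta_2, \qquad \varphi_1 \leq \varphi \leq \varphi_2. \]

The image of R under the map G is then an elementary spherical region
as shown in the next picture.

Figure 21

The volume of the elementary spherical region G(R) just described is
equal to

\[ \tfrac{1}{3} (\rho_2^3 - \rho_1^3)(\cos \varphi_1 - \cos \varphi_2)(\theta_2 - \theta_1). \]
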

In order to see this, we shall find the volume of a slightly simpler region, 
namely that lying above a cone and inside a sphere as shown on the next 
figure. 



The radius of the sphere is $\rho$, and the angle of the cone is $\varphi$, as shown on the
figure. We let a be the height at which the cone meets the sphere. The
volume of this region consists of two pieces. The first is the volume of a
cone of height a, and whose base has radius

\[ b = \rho \sin \varphi. \]

Observe that $a = \rho \cos \varphi$. The volume of this cone is therefore equal to

\[ \frac{\pi}{3} \rho^3 \sin^2 \varphi \cos \varphi. \]

The other piece lies below the spherical dome, and can be obtained as a
volume of revolution of the curve $x^2 + y^2 = \rho^2$, letting x range between
a and $\rho$. You should know how to do this, and you will find the answer

\[ \pi \left( \tfrac{2}{3} \rho^3 - \rho^3 \cos \varphi + \tfrac{1}{3} \rho^3 \cos^3 \varphi \right). \]

Adding our two volumes together, and noting that

\[ \cos^3 \varphi = \cos^2 \varphi \cos \varphi, \]

we find that the volume of the region lying above the cone and inside the
sphere is equal to

\[ \tfrac{2}{3} \pi \rho^3 - \tfrac{2}{3} \pi \rho^3 \cos \varphi. \]

The volume of this region lying between angles $\varphi_1$ and $\varphi_2$ is obtained by
subtracting, and is equal to

\[ \tfrac{2}{3} \pi \rho^3 (\cos \varphi_1 - \cos \varphi_2). \]

Considering only the part lying between the spheres of radii $\rho_1$ and $\rho_2$,
we obtain its volume again by subtraction, and get

\[ \tfrac{2}{3} \pi (\rho_2^3 - \rho_1^3)(\cos \varphi_1 - \cos \varphi_2). \]

Finally, we have to take that part lying between angles $\theta_1$ and $\theta_2$, that is,
take the fraction

\[ \frac{\theta_2 - \theta_1}{2\pi} \]

of this last volume. In this way, we obtain precisely the desired volume of
the elementary spherical region of Fig. 21.

Using the mean value theorem, we find that

\[ \rho_2^3 - \rho_1^3 = 3 \bar{\rho}^2 (\rho_2 - \rho_1), \]

for some number $\bar{\rho}$ between $\rho_1$ and $\rho_2$. Again by the mean value theorem,
we find that

\[ \cos \varphi_1 - \cos \varphi_2 = \sin \bar{\varphi} \, (\varphi_2 - \varphi_1) \]

for some number $\bar{\varphi}$ between $\varphi_1$ and $\varphi_2$.
Hence the volume of the elementary spherical region G(R) is equal to

\[ \bar{\rho}^2 \sin \bar{\varphi} \, (\rho_2 - \rho_1)(\varphi_2 - \varphi_1)(\theta_2 - \theta_1). \]

By forming Riemann sums as we already did in polar coordinates, it is
therefore reasonable that the triple integral of a function f over a region
G(A), which is obtained as the image under the spherical coordinate map,
can be expressed in terms of spherical coordinates by the formula:

\[ \iiint_A f^*(\rho, \theta, \varphi) \, \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta = \iiint_{G(A)} f(x, y, z) \, dz \, dy \, dx. \]

As usual, $f(G(\rho, \theta, \varphi)) = f^*(\rho, \theta, \varphi)$ is the value of the function at the
given point (x, y, z) in terms of the spherical coordinates of the point
$(\rho, \theta, \varphi)$.


Example. As a check, let us apply the general formula directly to see
if it gives us the same answer for the volume of the elementary spherical
region G(R). We are supposed to evaluate the integral

\[ \iiint_{G(R)} dz \, dy \, dx = \int_{\theta_1}^{\theta_2} \int_{\varphi_1}^{\varphi_2} \int_{\rho_1}^{\rho_2} \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta. \]

In this case, the limits of integration are constant, so the repeated 3-fold
integral splits into separate integrals with respect to $\rho$, $\varphi$, $\theta$ independently.
These integrals are of course very simple to evaluate. Integrating
with respect to $\rho$ yields the factor $\tfrac{1}{3}(\rho_2^3 - \rho_1^3)$. Integrating with respect to
$\varphi$ yields the factor $(\cos \varphi_1 - \cos \varphi_2)$. Integrating with respect to $\theta$ yields
the factor $(\theta_2 - \theta_1)$. Thus the evaluation of the integral checks with the
arguments given previously.
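The same check can be made numerically. In the sketch below, a midpoint Riemann sum of $\rho^2 \sin \varphi$ over a box in $(\rho, \varphi, \theta)$ is compared with the product of the three factors found above; the function names and the sample box are our own choices.

```python
import math

def spherical_box_volume(rho1, rho2, phi1, phi2, theta1, theta2, n=200):
    """Midpoint Riemann sum of rho^2 sin(phi) over the box. The integrand
    does not depend on theta, so only rho and phi are sampled."""
    h_rho, h_phi = (rho2 - rho1) / n, (phi2 - phi1) / n
    total = 0.0
    for i in range(n):
        rho = rho1 + (i + 0.5) * h_rho
        for j in range(n):
            phi = phi1 + (j + 0.5) * h_phi
            total += rho * rho * math.sin(phi)
    return total * h_rho * h_phi * (theta2 - theta1)

def closed_form(rho1, rho2, phi1, phi2, theta1, theta2):
    """(1/3)(rho2^3 - rho1^3)(cos phi1 - cos phi2)(theta2 - theta1)."""
    return ((rho2 ** 3 - rho1 ** 3) / 3) * (math.cos(phi1) - math.cos(phi2)) * (theta2 - theta1)

box = (1.0, 2.0, 0.4, 1.2, 0.0, 1.5)   # a sample box, chosen arbitrarily
print(spherical_box_volume(*box), closed_form(*box))
```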


Example. Find the volume above the cone $z^2 = x^2 + y^2$ and inside
the sphere $x^2 + y^2 + z^2 = z$ (Fig. 23).

The equation for the sphere in spherical coordinates is obtained by the
values

\[ \rho^2 = x^2 + y^2 + z^2 \quad \text{and} \quad z = \rho \cos \varphi. \]

Thus the sphere is given by the equation

\[ \rho = \cos \varphi. \]

The cone is given by $\cos^2 \varphi = \sin^2 \varphi$, and since $0 \leq \varphi \leq \pi/2$ on our
region (where $\rho = \cos \varphi \geq 0$), this is the same as $\varphi = \pi/4$. Thus the region of
integration is the image under the spherical coordinate map of the region A
described by the inequalities:

\[ 0 \leq \theta \leq 2\pi, \quad 0 \leq \varphi \leq \pi/4, \quad 0 \leq \rho \leq \cos \varphi. \]

Hence our volume is equal to the integral

\[ \int_0^{2\pi} \int_0^{\pi/4} \int_0^{\cos \varphi} \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta. \]

The inside integral with respect to $\rho$ is equal to

\[ (\sin \varphi) \, \frac{\rho^3}{3} \Big|_0^{\cos \varphi} = \frac{1}{3} \cos^3 \varphi \sin \varphi. \]

This is now easily integrated with respect to $\varphi$, and yields

\[ -\frac{1}{3} \cdot \frac{\cos^4 \varphi}{4} \Big|_0^{\pi/4} = \frac{1}{12} \left( -\frac{1}{4} + 1 \right) = \frac{3}{48}. \]

Finally, we integrate with respect to $\theta$, and the final answer is therefore
equal to

\[ \frac{3}{48} \cdot 2\pi = \frac{\pi}{8}. \]
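As a rough numerical check of this value, one may sum $\rho^2 \sin \varphi$ over the region $0 \leq \theta \leq 2\pi$, $0 \leq \varphi \leq \pi/4$, $0 \leq \rho \leq \cos \varphi$. The sketch below uses midpoint sums; the function name is ours, and the factor $2\pi$ from the integration in $\theta$ is inserted directly since the integrand does not depend on $\theta$.

```python
import math

def volume_above_cone_inside_sphere(n=400):
    """Midpoint sums for the integral of rho^2 sin(phi) over
    0 <= phi <= pi/4, 0 <= rho <= cos(phi), times 2 pi for theta."""
    h_phi = (math.pi / 4) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h_phi
        h_rho = math.cos(phi) / n          # the upper limit depends on phi
        inner = 0.0
        for j in range(n):
            rho = (j + 0.5) * h_rho
            inner += rho * rho * h_rho
        total += inner * math.sin(phi) * h_phi
    return 2 * math.pi * total

print(volume_above_cone_inside_sphere(), math.pi / 8)
```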


Example. Find the mass of a solid body S determined by the inequalities
of spherical coordinates:

\[ 0 \leq \theta \leq \frac{\pi}{2}, \quad \frac{\pi}{4} \leq \varphi \leq \arctan 2, \quad 0 \leq \rho \leq \sqrt{6}, \]

if the density, given as a function of the spherical coordinates $(\rho, \theta, \varphi)$,
is equal to $1/\rho$.

To find the mass, we have to integrate the given function over the
region. The integral is given by

\[ \int_0^{\pi/2} \int_{\pi/4}^{\arctan 2} \int_0^{\sqrt{6}} \frac{1}{\rho} \, \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta. \]

Performing the repeated integral, we obtain

\[ \frac{3\pi}{2} \left( \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{5}} \right). \]

We note that in the present example, the limits of integration are constants,
and hence the repeated integral is equal to a product of the integrals

\[ \int_0^{\pi/2} d\theta \cdot \int_{\pi/4}^{\arctan 2} \sin \varphi \, d\varphi \cdot \int_0^{\sqrt{6}} \rho \, d\rho. \]

Each integration can be performed separately. Of course, this does not
hold when the limits of integration are non-constant functions.
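The splitting into a product of three single integrals can be illustrated by evaluating each factor separately. This is a sketch under our own naming (`midpoint_1d`), with each factor approximated by a midpoint sum and the product compared with $\frac{3\pi}{2}\left(\frac{1}{\sqrt 2} - \frac{1}{\sqrt 5}\right)$, using $\cos(\arctan 2) = 1/\sqrt{5}$.

```python
import math

def midpoint_1d(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

theta_part = midpoint_1d(lambda t: 1.0, 0.0, math.pi / 2)          # pi/2
phi_part = midpoint_1d(math.sin, math.pi / 4, math.atan(2.0))      # cos(pi/4) - cos(arctan 2)
rho_part = midpoint_1d(lambda p: p, 0.0, math.sqrt(6.0))           # (1/rho) * rho^2 = rho; gives 3

mass = theta_part * phi_part * rho_part
closed = (3 * math.pi / 2) * (1 / math.sqrt(2) - 1 / math.sqrt(5))
print(mass, closed)
```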


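As before, we have a similar integral when the boundaries of integration
are not constant. We state the result:

Let a, b be numbers such that $0 \leq b - a \leq 2\pi$. Let $g_1(\theta)$, $g_2(\theta)$ be
smooth functions of $\theta$, defined on the interval $a \leq \theta \leq b$ such that

\[ 0 \leq g_1(\theta) \leq g_2(\theta) \leq \pi. \]

Let $h_1$, $h_2$ be functions of two variables, defined and smooth on the region
consisting of all points $(\theta, \varphi)$ such that

\[ a \leq \theta \leq b, \qquad g_1(\theta) \leq \varphi \leq g_2(\theta), \]

and such that $0 \leq h_1(\theta, \varphi) \leq h_2(\theta, \varphi)$ for all $(\theta, \varphi)$ in this region. Let A
be the 3-dimensional region in the $(\theta, \varphi, \rho)$-space consisting of all points
such that

\[ a \leq \theta \leq b, \qquad g_1(\theta) \leq \varphi \leq g_2(\theta), \qquad h_1(\theta, \varphi) \leq \rho \leq h_2(\theta, \varphi). \]

Let G be the spherical coordinate map, and let f be a continuous function
on G(A). Let $f^*(\theta, \varphi, \rho) = f(G(\theta, \varphi, \rho))$. Then

\[ \iiint_{G(A)} f(x, y, z) \, dz \, dy \, dx = \int_a^b \int_{g_1(\theta)}^{g_2(\theta)} \int_{h_1(\theta, \varphi)}^{h_2(\theta, \varphi)} f^*(\theta, \varphi, \rho) \, \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta. \]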

Exercises 

1. Find the volume inside the sphere 

x 2 + y 2 + z 2 = a 2 . 


by using spherical coordinates. 


2. Find the integral 



dz dp d9. 


3. (a) Find the mass of a spherical ball of radius a > 0 if the density at any
point is equal to a constant k times the distance of that point to the center.
(b) Find the integral of the function

\[ f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}} \]

over the spherical shell of inside radius a and outside radius 1. Assume
0 < a < 1. What is the limit of this integral as $a \to 0$?

4. Find the mass of a spherical shell of inside radius a and outside radius b if 
the density at any point is inversely proportional to the distance from the 
center. 

5. Find the integral of the function

\[ f(x, y, z) = x^2 \]

over that portion of the cylinder

\[ x^2 + y^2 = a^2 \]

lying between the planes

\[ z = 0 \quad \text{and} \quad z = b > 0. \]

6. Find the mass of a sphere of radius a if the density at any point is proportional
to the distance from a fixed plane passing through a diameter.

7. Find the volume of the region bounded by the cylinder y = cos x, and the
planes

\[ z = y, \quad x = 0, \quad x = \pi/2, \quad \text{and} \quad z = 0. \]

8. Find the volume of the region bounded above by the sphere 

x 2 + y 2 + z 2 = 1 

and below by the surface 

z = x 2 + y 2 . 

9. Find the volume of that portion of the sphere $x^2 + y^2 + z^2 = a^2$, which
is inside the cylinder $r = a \sin \theta$, using cylindrical coordinates.

10. Find the volume above the cone $z^2 = x^2 + y^2$ and inside the sphere
$\rho = 2a \cos \varphi$ (spherical coordinates). [Draw a picture. What is the center
of the sphere? What is the equation of the cone in spherical coordinates?]

11. Find the volumes of the following regions, in 3-space.

(a) Bounded above by the plane z = 1, and below by the top half of
$z^2 = x^2 + y^2$.

(b) Bounded above and below by $z^2 = x^2 + y^2$, and on the sides by
$x^2 + y^2 + z^2 = 1$.

(c) Bounded above by $z = x^2 + y^2$, below by z = 0, and on the sides
by $x^2 + y^2 = 1$.

(d) Bounded above by z = x, and below by $z = x^2 + y^2$.

12. Find the integral of the following functions over the indicated region, in 
3-space. 

(a) $f(x, y, z) = x^2$ over the tetrahedron bounded by the plane

\[ 12x + 20y + 15z = 60, \]

and the coordinate planes.

(b) f(x, y, z) = y over the tetrahedron as in (a).

(c) f(x, y, z) = 7yz over the region on the positive side of the (x, z)-plane,
bounded by the planes y = 0, z = 0, and z = a (for some positive
number a), and the cylinder $x^2 + y^2 = b^2$ (b > 0).

13. Find the volume of the region bounded by the cylinder r 2 = 16, by the plane 
z = 0, and below the plane y = 2z. 

14. Let n be a positive integer, and let $f(x, y, z) = 1/\rho^n$, where

\[ \rho = \sqrt{x^2 + y^2 + z^2}. \]

(a) Find the integral of the function

\[ f(x, y, z) = 1/\rho^n \]

over the region contained between two spheres of radii a and b respectively,
with 0 < a < b.

(b) For which values of n does this integral approach a limit as $a \to 0$?
Compare with the similar result which you may have worked out in the
preceding section for a function of two variables.


§5. Center of mass 

Double and triple integrals have an application to finding the center of
mass of a body in the plane or in 3-space. Let A be such a body, say in the
plane, and let f be its density function, giving the density at every point.
Let m be the total mass. Let $(\bar{x}, \bar{y})$ be the coordinates of the center of mass.
Then they are given by the integrals:

\[ \bar{x} = \frac{1}{m} \iint_A x f(x, y) \, dy \, dx = \frac{\iint_A x f(x, y) \, dy \, dx}{\iint_A f(x, y) \, dy \, dx}, \]

\[ \bar{y} = \frac{1}{m} \iint_A y f(x, y) \, dy \, dx = \frac{\iint_A y f(x, y) \, dy \, dx}{\iint_A f(x, y) \, dy \, dx}. \]

In 3-space, we would of course use the triple integral of x f(x, y, z) and
y f(x, y, z) over the body. For instance, the third coordinate of the center
of mass of a body A of total mass m in 3-space is given by

\[ \bar{z} = \frac{1}{m} \iiint_A z f(x, y, z) \, dx \, dy \, dz. \]


Example. Let us find the center of mass of the part of the first quadrant 
lying in the disc of radius 1, as shown on Fig. 24. We assume in this case 
that the density is uniform, say equal to 1. 



Figure 24

The total mass m is equal to $\pi/4$, and

\[ \bar{x} = \frac{1}{m} \iint_A x \, dy \, dx. \]

The integral is best evaluated by changing variables, i.e. using polar
coordinates. Thus we find:

\[ \iint_A x \, dy \, dx = \int_0^{\pi/2} \int_0^1 r \cos \theta \; r \, dr \, d\theta = \frac{1}{3}. \]

Hence

\[ \bar{x} = \frac{4}{3\pi}. \]

Similarly, or by symmetry, we have $\bar{y} = \dfrac{4}{3\pi}$ also.
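The coordinates of the center of mass of the quarter disc can be approximated by Riemann sums in polar coordinates. The following sketch assumes density 1 and uses midpoint sums; the function name is ours.

```python
import math

def quarter_disc_centroid(n=400):
    """Approximate the center of mass of the part of the unit disc lying in
    the first quadrant (density 1), using polar midpoint sums."""
    h_r, h_t = 1.0 / n, (math.pi / 2) / n
    mass = moment_x = moment_y = 0.0
    for i in range(n):
        r = (i + 0.5) * h_r
        for j in range(n):
            t = (j + 0.5) * h_t
            dA = r * h_r * h_t            # area element r dr dtheta
            mass += dA
            moment_x += r * math.cos(t) * dA
            moment_y += r * math.sin(t) * dA
    return moment_x / mass, moment_y / mass

x_bar, y_bar = quarter_disc_centroid()
print(x_bar, y_bar, 4 / (3 * math.pi))
```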


Example. Let us find the z-coordinate of the center of mass of the part
of the unit ball consisting of all points (x, y, z) whose coordinates are $\geq 0$.
If A denotes this part of the ball, then we have

\[ \bar{z} = \frac{1}{m} \iiint_A z \, dx \, dy \, dz. \]

By using spherical coordinates, the integral is equal to

\[ \int_0^{\pi/2} \int_0^{\pi/2} \int_0^1 \rho \cos \varphi \; \rho^2 \sin \varphi \, d\rho \, d\varphi \, d\theta. \]

Again we easily find the value $\pi/16$. We also know that the mass of the total
ball is $\tfrac{4}{3}\pi$. Hence the mass of our part of the ball is $\dfrac{1}{8} \cdot \dfrac{4\pi}{3} = \dfrac{\pi}{6}$, so that

\[ \bar{z} = \frac{\pi}{16} \cdot \frac{6}{\pi} = \frac{3}{8}. \]
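Since the limits are constant, the spherical-coordinate integral in this example splits into a product of three single integrals, and the value $\pi/16$ may be verified as follows:

```latex
\int_0^{\pi/2} d\theta \cdot \int_0^{\pi/2} \cos\varphi \, \sin\varphi \, d\varphi \cdot \int_0^1 \rho^3 \, d\rho
  = \frac{\pi}{2} \cdot \frac{1}{2} \cdot \frac{1}{4}
  = \frac{\pi}{16}.
```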


Exercises 

In each of the following cases, find the center of mass of the given body, assuming 
that the density is equal to 1. 

1. The triangle whose vertices are (0, 0), (3, 0), and (0, 5). 

2. The region enclosed by the parabola $y = 6x - x^2$ and the line y = x.

3. The upper half of the region enclosed by the ellipse as shown on the figure.

4. The region enclosed by the parabolas $y = 2x - x^2$ and $y = 3x^2 - 6x$.

5. The region enclosed by one arch of the curve y = sin x.

6. The region bounded by the curves y = sin x and y = cos x, for $0 \leq x \leq \pi/4$.

7. The region bounded by y = log x and y = 0, $1 \leq x \leq a$.

8. The inside of a cone of height h and base radius r, as shown on the figure.

9. Find (a) the mass and (b) the center of mass of a plate bounded by the upper
half of the curve $r = 2(1 + \cos \theta)$ (in polar coordinates) if the density is
proportional to the distance from the origin. The plate is drawn on the next
figure.



10. Find (a) the mass and (b) the center of mass of a right circular cylinder of 
radius a and height h if the density is proportional to the distance from the 
base.

11. (a) Find the mass of a circular plate of radius a whose density is proportional
to the distance from the center.

(b) Find the center of mass of this plate. 

(c) Find the center of mass of one quadrant of this plate. 

12. Find the mass of a circular cylinder of radius a and height h if the density 
is proportional to the square of the distance from the axis. 

13. Find the center of mass of a cone of height h and radius of the base equal to 
a, if the density is proportional to the distance from the base. 




CHAPTER XIII 


The Change of Variables Formula 


§ 1 . Determinants as area and volume 

We shall study the manner in which area changes under an arbitrary 
mapping by approximating this mapping with a linear map. Therefore, 
first we study how area and volume change under a linear map, and this 
leads us to interpret the determinant as area and volume according as 
we are in R 2 or R 3 . 

Let us first consider $\mathbf{R}^2$. Let

\[ A = \begin{pmatrix} a \\ c \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} b \\ d \end{pmatrix} \]

be two non-zero vectors in the plane, and suppose that they are not scalar
multiples of each other. We have already seen that they span a parallelogram,
as shown on Fig. 1.



Figure 1 


Theorem 1 in $\mathbf{R}^2$. Let A, B be non-zero elements of $\mathbf{R}^2$, which are not
scalar multiples of each other. Then the area of the parallelogram spanned
by A and B is equal to the absolute value of the determinant |D(A, B)|.

Proof. We assume known that this area is equal to the product of the
lengths of the base times the altitude, and this is equal to

\[ |A| \, |B| \, |\sin \theta|, \]

where $\theta$ is the angle between A and B (i.e. between OA and OB). This is
illustrated on Fig. 2.

Figure 2

Note that

\[ |\sin \theta| = \sqrt{1 - \cos^2 \theta}, \]

and recall from the theory of the dot product that

\[ \cos \theta = \frac{A \cdot B}{|A| \, |B|}. \]

We have

\[ \text{Area of parallelogram} = |A| \, |B| \sqrt{1 - \frac{(A \cdot B)^2}{|A|^2 |B|^2}} = \sqrt{|A|^2 |B|^2 - (A \cdot B)^2}. \]

All that remains to be done is to plug in the coordinates of A and B to see
that what we want comes out. Indeed, the above expression is equal to the
square root of

\[ (a^2 + c^2)(b^2 + d^2) - (ab + cd)^2. \]

If you expand this out, you will find that this last expression is equal to

\[ (ad - bc)^2. \]

Consequently, the area of the parallelogram is equal to

\[ \sqrt{(ad - bc)^2} = |ad - bc| = |D(A, B)|. \]

This proves our assertion.
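For reference, the expansion used in the proof runs as follows:

```latex
\begin{aligned}
(a^2 + c^2)(b^2 + d^2) - (ab + cd)^2
  &= a^2b^2 + a^2d^2 + b^2c^2 + c^2d^2 - (a^2b^2 + 2abcd + c^2d^2) \\
  &= a^2d^2 - 2abcd + b^2c^2 \\
  &= (ad - bc)^2 .
\end{aligned}
```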


Theorem 1 in $\mathbf{R}^3$. Let A, B, C be vectors in $\mathbf{R}^3$, and assume that the
segments OA, OB, OC do not all lie in a plane. Then the volume of the
box spanned by A, B, C is equal to the absolute value of the determinant

\[ |D(A, B, C)|. \]

Proof. Similar arguments to those which applied in $\mathbf{R}^2$ show us that
the area of the base of the box, spanned by A and B, is equal to

\[ (*) \qquad \sqrt{|A|^2 |B|^2 - (A \cdot B)^2}. \]

Look at Fig. 3.

Figure 3

The volume of the box is equal to the area of this base times the altitude,
and this altitude is equal to the length of the projection of C along a vector
perpendicular to A and B. You should now have read the section on cross
products, because the simplest way to handle the present situation is to use
the cross product. We know that $A \times B$ is such a vector, perpendicular
to A and B. The projection of C on $A \times B$ is equal to

\[ \frac{C \cdot (A \times B)}{(A \times B) \cdot (A \times B)} \, A \times B, \]

where the number in front of $A \times B$ is the component of C along $A \times B$
as studied in Chapter I. Therefore the length of this projection is equal to

\[ (**) \qquad \frac{|C \cdot (A \times B)|}{|A \times B|}. \]

On the other hand, if you look at property CP 6 of the cross product in
Chapter I, §6, you will find that (*) is equal to $|A \times B|$. Therefore, the
volume of the box spanned by A, B, C, which is equal to the product of (*)
and (**), is seen to be equal to

\[ |C \cdot (A \times B)|. \]

All that remains to be done is for you to plug in the coordinates, to see that
this is equal to the absolute value of the determinant. You let

\[ A = (a_1, a_2, a_3), \quad B = (b_1, b_2, b_3), \quad C = (c_1, c_2, c_3), \]

use the definition of the cross product of $A \times B$, and then dot with C.
You will find precisely the six terms which give the determinant D(A, B, C),
up to a sign, which is killed by the absolute value. This proves Theorem 1
in $\mathbf{R}^3$.
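The final step of the proof, that $C \cdot (A \times B)$ gives the six terms of the determinant, can be tried on numbers. In the sketch below the vectors are hypothetical sample values, and the helper names `cross`, `dot`, and `det3` are ours; the determinant is expanded along its first row.

```python
def cross(a, b):
    """Cross product of two vectors in 3-space."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product."""
    return sum(x * y for x, y in zip(a, b))

def det3(a, b, c):
    """Determinant of the 3 x 3 array with rows a, b, c (first-row expansion)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

# Hypothetical sample vectors, not taken from the text.
A, B, C = (1, 2, 0), (0, 1, 3), (2, -1, 1)

triple = dot(C, cross(A, B))
print(triple, det3(A, B, C))           # both equal 16
print(abs(triple))                     # volume of the box spanned by A, B, C
```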


Example. Let A = (3, 1) and B = (2, -5). Then the area of the
parallelogram spanned by A and B is equal to the absolute value of the
determinant

\[ \begin{vmatrix} 3 & 1 \\ 2 & -5 \end{vmatrix} = -17. \]

Hence this area is equal to 17. Note: We wrote our vectors horizontally.
We get the same determinant as if we write them vertically, namely

\[ \begin{vmatrix} 3 & 2 \\ 1 & -5 \end{vmatrix}, \]

because we know that the determinant of the transpose of a matrix is
equal to the determinant of the matrix.

We interpret Theorem 1 in terms of linear maps. Given vectors A, B
in the plane, we know that there exists a unique linear map

\[ L: \mathbf{R}^2 \to \mathbf{R}^2 \]

such that $L(E^1) = A$ and $L(E^2) = B$. In fact, if

\[ A = aE^1 + cE^2, \qquad B = bE^1 + dE^2, \]

then the matrix associated with the linear map is

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

Furthermore, if we denote by S the unit square spanned by $E^1$, $E^2$, and by
P the parallelogram spanned by A, B, then P is the image under L of S,
that is L(S) = P. Indeed, as we have seen, for $0 \leq t_i \leq 1$ we have

\[ L(t_1 E^1 + t_2 E^2) = t_1 L(E^1) + t_2 L(E^2) = t_1 A + t_2 B. \]

Let us define the determinant of a linear map to be the determinant of
its associated matrix. We conclude that

\[ (*) \qquad (\text{Area of } P) = |\mathrm{Det}(L)|. \]

Example. The area of the parallelogram spanned by the vectors (2, 1)
and (3, -1) (Fig. 4) is equal to the absolute value of

\[ \begin{vmatrix} 2 & 1 \\ 3 & -1 \end{vmatrix} = -5 \]

and hence is equal to 5.

Figure 4

Theorem 2. Let P be a parallelogram spanned by two vectors in $\mathbf{R}^2$. Let
$L: \mathbf{R}^2 \to \mathbf{R}^2$ be a linear map. Then

\[ \text{Area of } L(P) = |\mathrm{Det}\, L| \, (\text{Area of } P). \]

Proof. Suppose that P is spanned by two vectors A, B. Then L(P) is
spanned by L(A) and L(B). (Cf. Fig. 5.) There is a linear map $L_1: \mathbf{R}^2 \to \mathbf{R}^2$
such that

\[ L_1(E^1) = A \quad \text{and} \quad L_1(E^2) = B. \]

Then $P = L_1(S)$, where S is the unit square, and

\[ L(P) = L(L_1(S)) = (L \circ L_1)(S). \]

Figure 5

By what we proved above in (*), we obtain

\[ \text{Area } L(P) = |\mathrm{Det}(L \circ L_1)| = |\mathrm{Det}(L) \, \mathrm{Det}(L_1)| = |\mathrm{Det}(L)| \, \text{Area}(P), \]

thus proving our assertion.
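Theorem 2 can be tried on a numerical case. In the sketch below the matrix of L is a hypothetical choice of ours, while A = (3, 1) and B = (2, -5) are the vectors of an earlier example (area 17); the areas are computed as absolute values of 2 x 2 determinants.

```python
def det2(a, b):
    """Determinant of the 2 x 2 array whose rows are a and b."""
    return a[0] * b[1] - a[1] * b[0]

def apply_map(matrix, v):
    """Apply the linear map with the given 2 x 2 matrix (as rows) to v."""
    (p, q), (r, s) = matrix
    return (p * v[0] + q * v[1], r * v[0] + s * v[1])

L = ((2, 1), (1, 3))          # hypothetical linear map, determinant 5
A, B = (3, 1), (2, -5)        # parallelogram P spanned by A and B

area_P = abs(det2(A, B))                                  # 17
area_LP = abs(det2(apply_map(L, A), apply_map(L, B)))     # area of L(P)
det_L = det2(L[0], L[1])

print(area_LP, abs(det_L) * area_P)   # both equal 85
```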

Corollary. For any rectangle R with sides parallel to the axes, and any
linear map

\[ L: \mathbf{R}^2 \to \mathbf{R}^2, \]

we have

\[ \text{Area } L(R) = |\mathrm{Det}(L)| \, \text{Area}(R). \]

Figure 6

Proof. The rectangle R is equal to the translation of a rectangle $R_1$ as
shown on Fig. 6, with one corner at the origin, that is

\[ R = R_1 + A. \]

Then

\[ L(R) = L(R_1) + L(A). \]

The area of $L(R_1)$ is the same as the area of $L(R_1) + L(A)$ (i.e. the
translation of $L(R_1)$ by L(A)). All we have to do is apply Theorem 2 to
complete the proof.

Example. The area of the parallelogram spanned by the vectors

(3, -2) and (4, 1)

is equal to the absolute value of the determinant

\[ \begin{vmatrix} 3 & -2 \\ 4 & 1 \end{vmatrix}. \]

The determinant is equal to 11, so this is also the area.

Example. The area of the parallelogram spanned by the vectors

(3, 2) and (4, 1)

is equal to the absolute value of the determinant

\[ \begin{vmatrix} 3 & 2 \\ 4 & 1 \end{vmatrix}. \]

The determinant is equal to -5, so the area is equal to 5.

Example. The volume of the box spanned by the vectors

(3, 0, 1), (1, 2, 5), (-1, 4, 2)

is equal to 42, because the determinant

\[ \begin{vmatrix} 3 & 0 & 1 \\ 1 & 2 & 5 \\ -1 & 4 & 2 \end{vmatrix} \]

has the value -42.
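The three determinants in these examples are easily recomputed. The sketch below uses our own helper names `parallelogram_area` and `box_volume`, which simply take absolute values of the 2 x 2 and 3 x 3 determinants with the given vectors as rows.

```python
def det2(a, b):
    """Determinant of the 2 x 2 array with rows a, b."""
    return a[0] * b[1] - a[1] * b[0]

def det3(a, b, c):
    """Determinant of the 3 x 3 array with rows a, b, c (first-row expansion)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def parallelogram_area(a, b):
    return abs(det2(a, b))

def box_volume(a, b, c):
    return abs(det3(a, b, c))

print(det2((3, -2), (4, 1)), parallelogram_area((3, -2), (4, 1)))   # 11, 11
print(det2((3, 2), (4, 1)), parallelogram_area((3, 2), (4, 1)))     # -5, 5
print(det3((3, 0, 1), (1, 2, 5), (-1, 4, 2)),
      box_volume((3, 0, 1), (1, 2, 5), (-1, 4, 2)))                 # -42, 42
```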

We can also formulate Theorem 2 and its corollary in 3-space.

Theorem 2 in 3-space. Let P be a parallelotope (box) in 3-space, spanned
by three vectors. Let $L: \mathbf{R}^3 \to \mathbf{R}^3$ be a linear map. Then

\[ \text{Volume of } L(P) = |\mathrm{Det}\, L| \, (\text{Volume of } P). \]

Corollary. For any rectangular box R in 3-space and any linear map
$L: \mathbf{R}^3 \to \mathbf{R}^3$, we have

\[ \text{Vol } L(R) = |\mathrm{Det}(L)| \, \text{Vol}(R). \]

The proofs are exactly like those in 2-space, drawing 3-dimensional
boxes instead of 2-dimensional rectangles.

Exercises 

1. Find the area of the parallelogram spanned by 

(a) (-3, 5) and (2, -1). (b) (2, 3) and (4, -1). 

2. Find the area of the parallelogram spanned by the following vectors. 

(a) (2, 1) and (-4, 5) (b) (3, 4) and (-2, -3)

3. Find the area of the parallelogram such that three corners of the parallelogram
are given by the following points

(a) (1,1), (2,-1), (4,6) (b) (-3,2), (1,4), (-2,-7) 

(c) (2, 5), (-1, 4), (1, 2) (d) (1, 1), (1, 0), (2, 3) 

4. Find the volume of the parallelepiped spanned by the following vectors in 
3-space. 

(a) (1,1, 3), (1, 2, -1), (1, 4, 1) (b) (1, -1, 4), (1, 1, 0), (-1, 2, 5) 

(c) (-1,2, 1), (2, 0, 1), (1, 3, 0) (d) (-2, 2, 1), (0, 1, 0), (-4, 3, 2) 


§ 2 . Dilations 

This section will serve as an introduction to the general change of 
variables formula, and the interpretation of determinants as area and 
volume. 

Let r be a positive number. If A is a vector in $\mathbf{R}^n$ (in practice, $\mathbf{R}^2$ or $\mathbf{R}^3$)
we call rA the dilation of A by r. Thus dilation by r is a linear mapping,

\[ A \mapsto rA. \]

We wish to analyze what happens to area in R 2 , and volume in R 3 , under 
a dilation. We start with the simplest case, that of a rectangle. Consider 
a rectangle whose sides have lengths a, b, as on Fig. 7(a). If we multiply 
the sides of the rectangle by r, we obtain a rectangle with sides ra, rb as on 
Fig. 7(b). The area of the dilated rectangle is equal to 

rarb = r 2 ab. 

Thus dilation by r changes the area of the rectangle by r 2 . 


Figure 7. (a) A rectangle with sides a, b. (b) Its dilation, with sides ra, rb.


In general, let S be an arbitrary region in the plane $\mathbf{R}^2$, whose area can
be approximated by the area of a finite number of rectangles. Then the
area of S itself changes by $r^2$ under dilation by r, in other words,

\[ \text{Area of } rS = r^2 \, (\text{area of } S). \]

For instance, let $D_r$ be the disc of radius r, so that $D_1$ is the disc of radius
1, centered at the origin (Fig. 8). Then $D_r = rD_1$.

Figure 8

If $\pi$ is the area of the disc of radius 1, then $\pi r^2$ must be the area of the disc
of radius r. Of course, we knew this already, but we find this result here
again from another point of view. More generally, consider a region S
inside a curve as in Fig. 9(a), and let us draw the dilation of S by r in
Fig. 9(b). To justify that the area changes by $r^2$, we draw a grid, approximating
the areas by squares.

Under dilation by r, the area of each square gets multiplied by $r^2$, and so
the sum of the areas of these squares, which approximates the area of S,
also gets multiplied by $r^2$.

The question, of course, arises as to whether the squares lying inside S,
and formed by a sufficiently fine grid, actually approximate S. We can see
that they do, as follows. Let the sides of the squares in the grid have
length c (Fig. 10(a)). Suppose that a square intersects the curve which
bounds S. Let Z be this curve. Then any point in the square is at distance
at most $c\sqrt{2}$ from the curve Z. This is because the distance between any
two points of the square is at most $c\sqrt{2}$ (the length of the diagonal of the
square). Let us draw a band of width $c\sqrt{2}$ on each side of the curve, as
shown on Fig. 10(b). Then all the squares which intersect the curve must
lie within that band. It is very plausible that the area of the band is at
most equal to

\[ 2c\sqrt{2} \text{ times the length of the curve.} \]

Figure 10 (a), (b)

Thus if we take c to be very small, i.e. if we take the grid to be a very fine
grid, we see that the area of the region S is approximated by the area
covered by the squares lying entirely inside the region. This explains why
the area of S will get multiplied by $r^2$ under dilation by r.
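The grid argument can be imitated numerically. The sketch below counts the squares of a grid of side c lying entirely inside a disc, once for the unit disc and once for its dilation by r = 3, using the same fine grid for both; the function name `inner_grid_area` and the values of c and r are our own choices. Testing the four corners of a square suffices here only because a disc is convex.

```python
import math

def inner_grid_area(inside, half_width, c):
    """Total area of the grid squares of side c lying entirely inside the
    region. A square is counted when all four corners satisfy `inside`,
    which is adequate for a convex region such as a disc."""
    n = int(math.ceil(half_width / c))
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            corners = ((i * c, j * c), ((i + 1) * c, j * c),
                       (i * c, (j + 1) * c), ((i + 1) * c, (j + 1) * c))
            if all(inside(x, y) for x, y in corners):
                count += 1
    return count * c * c

r = 3.0
c = 0.01
unit = inner_grid_area(lambda x, y: x * x + y * y <= 1.0, 1.0, c)
dilated = inner_grid_area(lambda x, y: x * x + y * y <= r * r, r, c)

print(unit, math.pi)          # inner squares slightly underestimate the area
print(dilated / unit, r * r)  # the ratio is close to r^2
```

The shortfall in each case is the part of the band along the boundary not covered by inner squares, and it shrinks as c is made smaller.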

We can also make a mixed dilation. Let r, s be two positive numbers.
Consider the mapping of $\mathbf{R}^2$ given by

\[ (x, y) \mapsto (rx, sy). \]

Thus we dilate the first coordinate by r and the second by s. If a rectangle
R has sides of lengths a, b respectively, then the image of the rectangle
under this mapping will be a rectangle with sides of lengths ra, sb. Hence
the area of the image will be

\[ ra \, sb = rs \, ab. \]

Thus the area changes by a factor of rs under our mapping.

An argument as before shows that if we submit a region S to such a
mapping $F_{r,s}$ such that

\[ F_{r,s}(x, y) = (rx, sy), \]

then its area will change by a factor of rs.

Example. We now have a very easy way of finding the area of an ellipse
defined by an equation

\[ \frac{x^2}{9} + \frac{y^2}{16} = 1. \]

Indeed, let u = x/3 and v = y/4. Then

\[ u^2 + v^2 = 1, \]

and the ellipse is equal to the image of the circle under the mapping

\[ (u, v) \mapsto (3u, 4v). \]

Hence the area of the ellipse is equal to $3 \cdot 4 \cdot \pi = 12\pi$. Note how we did
this without integration! However, the technique of the small grid is of
course exactly the same technique which was used in the theory of the
integral.
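The value $12\pi$ can be compared with a direct count on a fine grid. The following sketch counts the midpoints of grid cells falling inside the ellipse $x^2/9 + y^2/16 \leq 1$; the function name `area_by_grid` and the grid size are our own choices.

```python
import math

def area_by_grid(inside, x_half, y_half, n=600):
    """Approximate an area by counting the midpoints of an n-by-n grid over
    the rectangle [-x_half, x_half] x [-y_half, y_half] that lie inside."""
    hx, hy = 2 * x_half / n, 2 * y_half / n
    count = 0
    for i in range(n):
        x = -x_half + (i + 0.5) * hx
        for j in range(n):
            y = -y_half + (j + 0.5) * hy
            if inside(x, y):
                count += 1
    return count * hx * hy

ellipse_area = area_by_grid(lambda x, y: x * x / 9 + y * y / 16 <= 1.0, 3.0, 4.0)
print(ellipse_area, 12 * math.pi)
```

Note that the grid over the rectangle of sides 6 and 8 is the image under $(u, v) \mapsto (3u, 4v)$ of a square grid over the square of side 2 containing the unit disc, so the count is the same as for the disc and only the area of each cell changes, by the factor $3 \cdot 4$.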

We can also develop the same ideas in 3-space. Consider dilation by r
in 3-space, namely consider the mapping

\[ (x, y, z) \mapsto (rx, ry, rz). \]

If P is a rectangular box with sides a, b, c, then its dilation by r will be a
box with sides ra, rb, rc, and the dilated box will have volume

\[ ra \, rb \, rc = r^3 abc. \]

Thus the volume of a box changes by $r^3$ under dilation by r.

Similarly, let r, s, t be positive numbers, and consider the linear map

\[ F_{r,s,t}: \mathbf{R}^3 \to \mathbf{R}^3 \]

such that

\[ F_{r,s,t}(x, y, z) = (rx, sy, tz). \]

We view this as a mixed dilation. If a rectangular box has sides of lengths
a, b, c, then under $F_{r,s,t}$ it gets transformed into a box with sides ra, sb, tc
whose volume is

\[ ra \, sb \, tc = rst \, abc. \]

Thus the volume gets multiplied by rst.

If we approximate an arbitrary region in 3-space by cubes, then we see
in a manner analogous to that of 2-space that the volume of the region
changes by a factor of $r^3$ under dilation by r, and changes by a factor of
rst under the mixed dilation $F_{r,s,t}$.

Example. Find the volume of the region bounded by the equation

\[ \frac{x^2}{9} + \frac{y^2}{16} + \frac{z^2}{25} = 1. \]

To do this, let

\[ u = \frac{x}{3}, \qquad v = \frac{y}{4}, \qquad w = \frac{z}{5}. \]

The inequality

\[ u^2 + v^2 + w^2 \leq 1 \]

defines the unit ball in $\mathbf{R}^3$, and our given region is obtained from this unit
ball by the mixed dilation

\[ F_{3,4,5}. \]

Assuming that the volume of the unit ball in $\mathbf{R}^3$ is equal to $\tfrac{4}{3}\pi$, we conclude
that the volume of our region is equal to

\[ 3 \cdot 4 \cdot 5 \cdot \tfrac{4}{3}\pi = 80\pi. \]
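One way to cross-check the value $80\pi$ is to slice the region by horizontal planes. At height z the slice is an ellipse with semi-axes $3\sqrt{s}$ and $4\sqrt{s}$, where $s = 1 - z^2/25$, so by the mixed dilation rule in the plane its area is $12\pi s$. The sketch below sums these areas by the midpoint rule; the function name and the number of slices are our own choices.

```python
import math

def ellipsoid_volume_by_slices(a, b, c, n=1000):
    """Sum the areas of horizontal elliptical slices of the region
    x^2/a^2 + y^2/b^2 + z^2/c^2 <= 1, using the midpoint rule in z.
    Each slice area is pi * a * b * (1 - z^2/c^2)."""
    h = 2 * c / n
    total = 0.0
    for k in range(n):
        z = -c + (k + 0.5) * h
        s = 1.0 - (z / c) ** 2        # square of the shrink factor at height z
        total += math.pi * a * b * s * h
    return total

volume = ellipsoid_volume_by_slices(3.0, 4.0, 5.0)
print(volume, 80 * math.pi)
```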

In the next section, we investigate how area and volume change under 
general linear maps, not just dilations and mixed dilations. 

Exercises 

1. Find the area of the region bounded by the ellipse 

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

2. Find the volume of the region bounded by the surface

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$

In both exercises, a, b, c are positive numbers. Use the ideas of this section. 

3. Let A be the region in 3-space defined by the inequalities 

0 ^ Xi and + 4 + 4 ^ 1- 

Let C be the volume of this region. 

(a) In terms of C, what is the volume of the region defined by the inequalities 

0 ^ Xi and x\ + xt + xt ^ 29? 

(b) Same question if instead of 29 on the right you have a positive number $r$.

4. Let $A$ be the region in 3-space defined by the inequalities

3 

0 ^ jCj and ^ = 1. 

i=i 

Let C be the volume of this region. 

(a) In terms of C, what is the volume of the region defined by the inequalities 

3 

0 ^ Xi and £ jc? ^ 33? 

1 = 1 

(b) Same question if instead of 33 you have an arbitrary positive number r 
on the right. 

§3. Change of variables formula in two dimensions 

Let $R$ be a rectangle in $\mathbf{R}^2$ and suppose that $R$ is contained in some
open set $U$. Let

$$G\colon U \to \mathbf{R}^2$$

be a $C^1$-map. If $G$ has two coordinate functions,

$$G(u, v) = (g_1(u, v), g_2(u, v)),$$

this means that the partial derivatives of $g_1, g_2$ exist and are continuous.
We let $G(u, v) = (x, y)$, so that

$$x = g_1(u, v) \quad\text{and}\quad y = g_2(u, v).$$

Then the Jacobian determinant of the map $G$ is by definition

$$\Delta_G(u, v) = \begin{vmatrix} \dfrac{\partial g_1}{\partial u} & \dfrac{\partial g_1}{\partial v} \\[2ex] \dfrac{\partial g_2}{\partial u} & \dfrac{\partial g_2}{\partial v} \end{vmatrix}.$$

This determinant is nothing but the determinant of the linear map $G'(u, v)$,
which is the tangent linear map to $G$ at $(u, v)$.

Theorem 3. Assume that $G$ is $C^1$-invertible on the interior of the rectangle
$R$. Let $f$ be a function on $G(R)$ which is continuous except on a
finite number of smooth curves. Then

$$\int_R (f \circ G)\,|\Delta_G| = \int_{G(R)} f,$$

or in terms of coordinates,

$$\iint_R f(G(u, v))\,|\Delta_G(u, v)|\,du\,dv = \iint_{G(R)} f(x, y)\,dy\,dx.$$

A rigorous proof of Theorem 3 is not easy, since it depends
on $\epsilon$ and $\delta$ arguments. However, we can make it plausible in view of
Theorem 2.


Figure 11

Indeed, suppose first that $f$ is a constant function, say $f(x, y) = 1$ for
all $(x, y)$. Then the integral on the right, over $G(R)$, is simply the area of
$G(R)$, and our formula reduces to

$$\int_R |\Delta_G| = \int_{G(R)} 1.$$

As we pointed out before, $\Delta_G$ is the determinant of the approximating
linear map to $G$. If $G$ is itself linear, then $G'(u, v) = G$ for all $u, v$ and in
this case, our formula reduces to Theorem 2, or rather its corollary. In
the general case, one has to show that when one approximates $G$ by its
tangent linear map, which depends on $(u, v)$, and then integrates $|\Delta_G|$,
one still obtains the same result. Cf., for instance, my Introduction to
Analysis for a complete proof. A special case will be proved in the next
chapter.

When $f$ is not a constant function, one still has the problem of reducing
this case to the case of constant functions. This is done by taking a partition
of $R$ into small rectangles $S$, and then approximating $f$ on each
$G(S)$ by a constant function. Again, the details are beyond the scope of
this book.

We shall now see how we recover the integral in terms of polar coor¬ 
dinates from the general Theorem 3. 


Example 1. Let $x = r\cos\theta$ and $y = r\sin\theta$, $r \geq 0$. In this
case, we have computed previously the determinant, which is

$$\Delta_G(r, \theta) = r.$$

Thus we find again the formula

$$\iint_R f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta = \iint_{G(R)} f(x, y)\,dy\,dx.$$

Of course, we have to take a rectangle for which the map

$$G(r, \theta) = (r\cos\theta, r\sin\theta)$$

is invertible on the interior of the rectangle. For instance, we can take

$$0 \leq r_1 \leq r \leq r_2 \quad\text{and}\quad 0 \leq \theta_1 \leq \theta \leq \theta_2 \leq 2\pi.$$

The image of the rectangle $R$ is the portion $G(R)$ of the sector as shown
in Fig. 12.

Figure 12



For the next example, we observe that if $G$ is a linear map $L$, represented
by a matrix $M$, then the Jacobian matrix of $G$ is equal to this matrix $M$, and
hence its Jacobian determinant is the determinant of $M$.

Example 2. Let $T$ be the triangle whose vertices are $(1, 2)$, $(3, -1)$,
and $(0, 0)$. Find the area of this triangle.

Figure 13 (the triangle with vertices $0$, $A = (1, 2)$, and $B = (3, -1)$)

The triangle $T$ is the image of the triangle spanned by $0, E_1, E_2$ under a
linear map, namely the linear map $L$ such that

$$L(E_1) = (1, 2) \quad\text{and}\quad L(E_2) = (3, -1).$$

It is verified at once that $|\mathrm{Det}(L)| = 7$. Since the area of the triangle
spanned by $0, E_1, E_2$ is $\tfrac{1}{2}$, it follows that the desired area is equal to $\tfrac{7}{2}$.
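
As a small check of this example, the area of a triangle with one vertex at the origin can be computed from a 2 x 2 determinant. This is a minimal sketch; the helper names `det2` and `triangle_area` are illustrative and do not come from the text.

```python
def det2(a, b):
    """Determinant of the 2 x 2 matrix whose columns are the vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]


def triangle_area(p, q):
    """Area of the triangle with vertices 0, p, q.

    The unit triangle spanned by 0, E1, E2 has area 1/2, and the linear map
    sending E1 to p and E2 to q multiplies areas by |det|.
    """
    return abs(det2(p, q)) / 2


area = triangle_area((1, 2), (3, -1))
print(area)
```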

Example 3. Let $(x, y) = G(u, v) = (e^u\cos v, e^u\sin v)$. Let $R$ be the
rectangle in the $(u, v)$-space defined by the inequalities $0 \leq u \leq 1$ and
$0 \leq v \leq \pi$. It is not difficult to show that $G$ satisfies the hypotheses of
Theorem 3, but we shall assume this. The Jacobian matrix of $G$ is given
by

$$\begin{pmatrix} e^u\cos v & -e^u\sin v \\ e^u\sin v & e^u\cos v \end{pmatrix}$$

so that its Jacobian determinant is equal to

$$\Delta_G(u, v) = e^{2u}.$$

Let $f(x, y) = x^2$. Then $f^*(u, v) = f(G(u, v)) = e^{2u}\cos^2 v$. According to Theorem 3,
the integral of $f$ over $G(R)$ is given by the integral

$$\int_0^{\pi}\!\!\int_0^1 e^{4u}\cos^2 v\,du\,dv,$$

which can be evaluated very simply by integrating $e^{4u}$ with respect to $u$
and $\cos^2 v$ with respect to $v$, and taking the product. The final answer is
then equal to

$$\frac{e^4 - 1}{4} \cdot \frac{\pi}{2} = \frac{\pi(e^4 - 1)}{8}.$$
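
The value of this integral can be confirmed with a simple midpoint rule. The sketch below assumes nothing beyond the formulas of the example; the helper `midpoint_double_integral` is an illustrative name, and the grid size `n` is an arbitrary choice.

```python
import math


def midpoint_double_integral(h, u0, u1, v0, v1, n=200):
    """Midpoint-rule approximation of the integral of h(u, v) over [u0, u1] x [v0, v1]."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            total += h(u, v)
    return total * du * dv


def integrand(u, v):
    """f(G(u, v)) times the Jacobian determinant, for f(x, y) = x^2."""
    x = math.exp(u) * math.cos(v)   # first coordinate of G(u, v)
    jacobian = math.exp(2 * u)      # determinant of the Jacobian matrix of G
    return x * x * jacobian


numeric = midpoint_double_integral(integrand, 0.0, 1.0, 0.0, math.pi)
closed_form = math.pi * (math.exp(4) - 1) / 8
print(numeric, closed_form)
```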


Example 4. Let $S$ be the region enclosed by the ellipse defined by the
equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

Its area is $\pi ab$. (Why?) Let $L$ be the linear map represented by the matrix



Its determinant is equal to $-11$. Hence the area of the image of $S$ under
$L$ is $11\pi ab$.


Exercises 

In the following exercises, you may assume that the map G satisfies the hypotheses 
of Theorem 3. 

1. Let $(x, y) = G(u, v) = (u^2 - v^2, 2uv)$. Let $A$ be the region defined by
$u^2 + v^2 \leq 1$ and $0 \leq u$, $0 \leq v$. Find the integral of the function

$$f(x, y) = 1/(x^2 + y^2)^{1/2}$$

over $G(A)$.

2. (a) Let $(x, y) = G(u, v)$ be the same map as in Exercise 1. Let $A$ be the
square $0 \leq u \leq 2$ and $0 \leq v \leq 2$. Find the area of $G(A)$.

(b) Find the integral of $f(x, y) = x$ over $G(A)$.

3. (a) Let $R$ be the rectangle whose corners are $(1, 2)$, $(1, 5)$, $(3, 2)$, and $(3, 5)$.
Let $G$ be the linear map represented by the matrix



Find the area of G(R). 

(b) Same question if $G$ is represented by the matrix $\begin{pmatrix} 3 & 2 \\ 1 & -6 \end{pmatrix}$.

4. Let $(x, y) = G(u, v) = (u + v, u^2 - v)$. Let $A$ be the region in the first
quadrant bounded by the axes and the line $u + v = 2$. Find the integral
of the function $f(x, y) = 1/\sqrt{1 + 4x + 4y}$ over $G(A)$.

5. Let $R$ be the unit square in the $(u, v)$-plane, defined by the inequalities

$$0 \leq u \leq 1 \quad\text{and}\quad 0 \leq v \leq 1.$$

(a) Sketch the image $F(R)$ of $R$ under the mapping $F$ such that

$$F(u, v) = (u, u + v^2).$$

In other words, $x = u$ and $y = u + v^2$.

(b) Compute the integral of the function $f(x, y) = x$ over the region
$F(R)$ by using the change of variables formula.

6. Compute the area enclosed by the ellipse, defined by

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

Take $a, b > 0$.

7. Let $(x, y) = G(u, v) = (u, v(1 + u^2))$. Let $R$ be the rectangle $0 \leq u \leq 3$
and $0 \leq v \leq 2$. Find the integral of $f(x, y) = x$ over $G(R)$.

8. Let G be the linear map represented by the matrix 



If $A$ is the interior of a circle of radius 10, what is the area of $G(A)$?

9. Let G be the linear map of Exercise 8, and let A be the ellipse defined as in 
Exercise 6. What is the area of G(A) ? 

10. Let $T$ be the triangle bounded by the $x$-axis, the $y$-axis, and the line
$x + y = 1$. Let $\varphi$ be a continuous function of one variable on the interval
$[0, 1]$. Let $m, n$ be positive integers. Show that

$$\iint_T \varphi(x + y)\,x^m y^n\,dy\,dx = c_{m,n}\int_0^1 \varphi(t)\,t^{m+n+1}\,dt,$$

where $c_{m,n}$ is the constant given by the integral $\int_0^1 (1 - t)^m t^n\,dt$. [Hint:
Let $x = u - v$ and $y = v$.]

11. Let $B$ be the region bounded by the ellipse $x^2/a^2 + y^2/b^2 = 1$. Find the
integral

r r 

y dy dx. 


IJ> 


§4. Change of variables formula in three dimensions

The formula has the same shape as in two dimensions, namely: 

Change of Variables Formula. Let $A$ be a bounded region in $\mathbf{R}^3$ whose
boundary consists of a finite number of smooth surfaces. Let $A$ be contained
in some open set $U$, and let

$$G\colon U \to \mathbf{R}^3$$

be a $C^1$-map, which we assume to be $C^1$-invertible on the interior of $A$.
Let $f$ be a function on $G(A)$, bounded and continuous except on a finite
number of smooth surfaces. Then

$$\iiint_A f(G(u, v, w))\,|\Delta_G(u, v, w)|\,du\,dv\,dw = \iiint_{G(A)} f(x, y, z)\,dz\,dy\,dx.$$

In the 3-dimensional case, the Jacobian matrix of $G$ at every point is then
a $3 \times 3$ matrix.

Example. Let $R$ be the 3-dimensional rectangle spanned by the three
unit vectors $E_1, E_2, E_3$. Let $A_1, A_2, A_3$ be three vectors in 3-space, and let

$$G\colon \mathbf{R}^3 \to \mathbf{R}^3$$

be the linear map such that $G(E_i) = A_i$. Then $G(R)$ is a parallelotope
(not necessarily rectangular). (Cf. Fig. 14.)

Figure 14

The Jacobian determinant of the map is constant, and is equal to the
determinant of the matrix representing the linear map.

The volume of the unit cube is equal to 1. Hence the volume of $G(R)$
is equal to $|\mathrm{Det}(G)|$.

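For instance, if

$$A_1 = (3, 1, 2), \quad A_2 = (1, -1, 4), \quad A_3 = (2, 1, 0),$$

then

$$\mathrm{Det}(G) = \begin{vmatrix} 3 & 1 & 2 \\ 1 & -1 & 4 \\ 2 & 1 & 0 \end{vmatrix} = 3(0 - 4) - 1(0 - 8) + 2(1 + 2) = 2,$$

so the volume of $G(R)$ is equal to 2.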

Similarly, we find the volume of the tetrahedron spanned by the origin
and the three vectors

$$A_1 = (3, 1, 4), \quad A_2 = (-1, 2, 1), \quad A_3 = (5, -2, 1).$$

We assume that you have computed the volume of the tetrahedron
spanned by the unit vectors, and found $\tfrac{1}{6}$. There is a unique linear map $L$
which carries $E_i$ on $A_i$. Hence the volume of our tetrahedron is equal to $\tfrac{1}{6}$
times the absolute value of the determinant of this linear map, that is to $\tfrac{1}{6}$
times the absolute value of the determinant

$$\begin{vmatrix} 3 & 1 & 4 \\ -1 & 2 & 1 \\ 5 & -2 & 1 \end{vmatrix} = -14.$$

The answer is $14/6 = 7/3$.

If we are given the four vertices of a tetrahedron and want to find its 
volume, then we subtract one vertex from the others. This gives us a 
tetrahedron with one vertex at the origin, whose volume can be found by 
the above procedure. 
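
The procedure just described can be sketched in a few lines. The function names `det3` and `tetrahedron_volume` are illustrative, and exact rational arithmetic is used only so that the answer prints as a fraction.

```python
from fractions import Fraction


def det3(a, b, c):
    """Determinant of the 3 x 3 matrix whose rows are the vectors a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def tetrahedron_volume(p0, p1, p2, p3):
    """Volume of the tetrahedron with the four given vertices (integer coordinates).

    One vertex is subtracted from the others, which moves it to the origin
    without changing the volume; the result is |det| / 6.
    """
    edges = [tuple(p[i] - p0[i] for i in range(3)) for p in (p1, p2, p3)]
    return Fraction(abs(det3(*edges)), 6)


origin = (0, 0, 0)
volume = tetrahedron_volume(origin, (3, 1, 4), (-1, 2, 1), (5, -2, 1))
print(volume)
```

Translating all four vertices by the same vector leaves the result unchanged, which is the reason subtracting one vertex is legitimate.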

Example. Consider the cylindrical coordinates map, given by

$$G(r, \theta, z) = (r\cos\theta, r\sin\theta, z).$$

Compute its Jacobian matrix, and its Jacobian determinant. You will
easily find

$$\Delta_G(r, \theta, z) = r,$$

so that the general formula for changing variables gives you the same
result that was found in Chapter XII by looking at the volume of an
elementary region, image of a box under the map $G$.

Example. Let $G$ be the map of spherical coordinates, given by

$$G(\rho, \theta, \varphi) = (\rho\sin\varphi\cos\theta, \rho\sin\varphi\sin\theta, \rho\cos\varphi).$$

Again you should compute the Jacobian matrix and the Jacobian determinant.
You will find:

$$|\Delta_G(\rho, \theta, \varphi)| = \rho^2\sin\varphi.$$

This gives a justification for the formula of Chapter XII in terms of the
change of variables formula, which in the present case reads just like the
result of Chapter XII, namely:

$$\iiint_A f(G(\rho, \theta, \varphi))\,\rho^2\sin\varphi\,d\rho\,d\varphi\,d\theta = \iiint_{G(A)} f(x, y, z)\,dz\,dy\,dx.$$


Exercise. Carry out in detail the computation of the preceding two 
examples. 
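
Before carrying out the computation by hand, one can check the expected answers numerically. The sketch below approximates the Jacobian determinant by central differences; the function name `jacobian_det`, the step `h`, and the sample points are all illustrative assumptions. With the variables taken in the order $(\rho, \theta, \varphi)$ the determinant comes out negative, which is harmless since only its absolute value enters the formula.

```python
import math


def jacobian_det(G, p, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of G at p.

    G maps three numbers to a triple; p is a point given as a triple.
    """
    rows = []
    for j in range(3):
        plus = list(p)
        minus = list(p)
        plus[j] += h
        minus[j] -= h
        gp, gm = G(*plus), G(*minus)
        # Partial derivatives of the three coordinates with respect to variable j.
        rows.append([(gp[i] - gm[i]) / (2 * h) for i in range(3)])
    # rows is the transpose of the Jacobian matrix; the determinant is the same.
    a, b, c = rows
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def cylindrical(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)


def spherical(rho, theta, phi):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


det_cyl = jacobian_det(cylindrical, (2.0, 0.7, 1.5))
det_sph = jacobian_det(spherical, (2.0, 0.7, 1.1))
print(det_cyl, abs(det_sph), 2.0 ** 2 * math.sin(1.1))
```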


Exercises 

1. (a) Let $G\colon \mathbf{R}^3 \to \mathbf{R}^3$ be the map which sends spherical coordinates $(\theta, \varphi, \rho)$
into cylindrical coordinates $(\theta, r, z)$. Write down the Jacobian matrix
for this map, and its Jacobian determinant.

(b) Write down the change of variables formula for this case.

2. Let $A$ be a region in $\mathbf{R}^3$ and assume that its volume is equal to $k$. Let
$G\colon \mathbf{R}^3 \to \mathbf{R}^3$ be the map such that $G(x, y, z) = (ax, by, cz)$, where $a, b, c$
are positive numbers. What is the volume of $G(A)$?

3. Find the volume of the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \leq 1$$

by the change of variables formula, and by the method of dilations.

4. Find the volume of the solid which is the image of a ball of radius a under 
the linear map represented by the matrix 



0 0 


l

5. (a) Find the volume of the tetrahedron $T$ determined by the inequalities

$$0 \leq x, \quad 0 \leq y, \quad 0 \leq z \quad\text{and}\quad x + y + z \leq 1.$$

(b) This tetrahedron can also be written in the form

$$t_1E_1 + t_2E_2 + t_3E_3 \quad\text{with}\quad t_1 + t_2 + t_3 \leq 1,\ 0 \leq t_i.$$

If $L$ is the linear map such that $L(E_i) = A_i$, show that $L(T)$ is described
by similar inequalities. We call it the tetrahedron spanned by $0, A_1, A_2, A_3$.

(c) Determine the volume of the tetrahedron spanned by the origin and 
the three vectors (1, 1, 2), (2, 0, —1), (3, 1, 2). 

(d) Using the fact that the volume of a region does not change under 
translation, determine the volume of the tetrahedron spanned by the 
four points (1, 1, 1), (2, 2, 3), (3, 1,0), and (4, 2, 3). 

6. (a) Determine the volume of the tetrahedron spanned by the four points 

(2,1,0), (3,-1,1), (-1,1,2), (0, 0,1). 

(b) Same question for the four points (3, 1,2), (2, 0, 0), (4, 1, 5), (5, — 1, 1). 




CHAPTER XIV 


Green’s Theorem 


§1. Statement of the theorem 


In this chapter, we shall change slightly our notation concerning curve 
integrals. 

Suppose we are given a vector field on some open set $U$ in the plane.
Then this vector field has two components, i.e. we can write

$$F(x, y) = (P(x, y), Q(x, y)),$$

where $P, Q$ are functions of two variables $(x, y)$. In everything that follows,
we assume that all functions we deal with are $C^1$, i.e. that these
functions have continuous partial derivatives.

Let $C\colon [a, b] \to U$ be a curve. We shall use a new notation for the
integral of $F$ over $C$, namely we write

$$\int_C F = \int_a^b F(C(t)) \cdot C'(t)\,dt = \int_C P(x, y)\,dx + Q(x, y)\,dy$$

or abbreviate this as

$$\int_C P\,dx + Q\,dy.$$

This is reasonable since the curve gives

$$x = x(t) \quad\text{and}\quad y = y(t)$$

as functions of $t$, and

$$F(C(t)) \cdot \frac{dC}{dt} = P(x, y)\frac{dx}{dt} + Q(x, y)\frac{dy}{dt}.$$

Green's Theorem. Let $P, Q$ be $C^1$-functions on a region $A$, which is
the interior of a closed piecewise $C^1$-path $C$, parametrized counterclockwise.
Then

$$\int_C P\,dx + Q\,dy = \iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dy\,dx.$$

The region and its boundary may look as follows:


Figure 1 

It is difficult to prove Green’s theorem in general, partly because it is 
difficult to make rigorous the notion of “interior” of a path, and also the 
notion of counterclockwise. In practice, for any specifically given region, 
it is always easy, however. That it may be difficult in general is already 
suggested by drawing a somewhat less simple region as follows: 




Figure 2 


We shall therefore prove Green’s theorem only in special cases, where 
we can give the region and the parametrization of its boundary explicitly. 

Case 1. Suppose that the region $A$ is given by the inequalities

$$a \leq x \leq b \quad\text{and}\quad g_1(x) \leq y \leq g_2(x)$$

in the same manner as we studied before in Chapter XII, §2.



Figure 3

The boundary of $A$ then consists of four pieces, the two vertical segments,
and the pieces parametrized by the maps:

$$\gamma_1\colon t \mapsto (t, g_1(t)), \quad a \leq t \leq b,$$

$$\gamma_2\colon t \mapsto (t, g_2(t)), \quad a \leq t \leq b.$$

Then we can prove one-half of Green's theorem, namely

$$\int_C P\,dx = \iint_A -\frac{\partial P}{\partial y}\,dy\,dx.$$

Proof. We have

$$\begin{aligned}
\iint_A \frac{\partial P}{\partial y}\,dy\,dx &= \int_a^b \left[\int_{g_1(x)}^{g_2(x)} D_2P(x, y)\,dy\right] dx \\
&= \int_a^b P(x, y)\Big|_{g_1(x)}^{g_2(x)}\,dx \\
&= \int_a^b \left[P(x, g_2(x)) - P(x, g_1(x))\right] dx \\
&= \int_{\gamma_2} P\,dx - \int_{\gamma_1} P\,dx.
\end{aligned}$$

However, the boundary of $A$, oriented counterclockwise, consists of four
pieces,

$$\gamma_1, \ \gamma_2^-, \ \gamma_3, \ \gamma_4,$$

where $\gamma_2^-$ is the opposite curve to $\gamma_2$, and $\gamma_3, \gamma_4$ are the vertical segments.
One sees at once that the integrals

$$\int_{\gamma_3} P\,dx \quad\text{and}\quad \int_{\gamma_4} P\,dx$$

are equal to 0, and thus we obtain the formula in this case.

Case 2. Suppose that the region is given by similar inequalities as in
Case 1, but with respect to the $y$-axis. In other words, the region $A$ is
defined by inequalities

$$c \leq y \leq d \quad\text{and}\quad g_1(y) \leq x \leq g_2(y).$$

Then we prove the other half of Green's theorem, namely

$$\iint_A \frac{\partial Q}{\partial x}\,dx\,dy = \int_C Q\,dy.$$

Proof. We take the integral with respect to $x$ first:

$$\begin{aligned}
\iint_A \frac{\partial Q}{\partial x}\,dx\,dy &= \int_c^d \left[\int_{g_1(y)}^{g_2(y)} D_1Q(x, y)\,dx\right] dy \\
&= \int_c^d \left[Q(g_2(y), y) - Q(g_1(y), y)\right] dy.
\end{aligned}$$

In this case, the integral of $Q\,dy$ over the horizontal segments is equal to
0, and hence our formula is proved.



Figure 4 


In particular, if a region is of a type satisfying both the preceding 
conditions, then the full theorem follows. Examples of such regions are 
rectangles and triangles and interiors of circles: 



Figure 5 


We have therefore proved Green’s theorem in these cases. 

Frequently, a region can be decomposed into regions of the preceding 
types. We draw a picture to illustrate this, namely the annulus lying be¬ 
tween two circles. 



Figure 6

By drawing four line segments as shown, we decompose this annulus
into four regions, and it would thus suffice to prove Green’s theorem for
each one of these four regions. None of them yet satisfies the desired
hypotheses, but one more decomposition will do for each region, as shown
in the next picture.



Figure 7 


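Consequently if we denote by $C_1$ the outside circle taken counterclockwise,
and by $C_2$ the inside circle taken counterclockwise, if we let
$C = \{C_1, C_2^-\}$, and if $A$ is the region between $C_1$ and $C_2$, then

$$\iint_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dy\,dx = \int_C P\,dx + Q\,dy = \int_{C_1} P\,dx + Q\,dy - \int_{C_2} P\,dx + Q\,dy.$$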


Example 1. Let $A$ be the region between two concentric circles $C_1$,
$C_2$ as shown, both with counterclockwise orientation (Fig. 8).

Figure 8

Let $F = (P, Q)$ be a vector field on $A$ and suppose that

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}.$$

Then the left-hand side in the above relation is equal to 0, and consequently
we see that the integral of $F$ over $C_1$ is equal to the integral of $F$
over $C_2$, in other words

$$\int_{C_1} P\,dx + Q\,dy = \int_{C_2} P\,dx + Q\,dy.$$

Of course, if $F$ is the gradient of a function, then both these integrals
are 0. However, we saw previously that there exist vector fields satisfying
the condition $\partial P/\partial y = \partial Q/\partial x$, but not having potential functions, e.g.

$$F(x, y) = \left(\frac{-y}{x^2 + y^2}, \ \frac{x}{x^2 + y^2}\right).$$

Example 2. Find the integral of the vector field

$$F(x, y) = (y + 3x, 2y - x)$$

counterclockwise around the ellipse $4x^2 + y^2 = 4$.

Let $P(x, y) = y + 3x$ and $Q(x, y) = 2y - x$. Then $\partial Q/\partial x = -1$
and $\partial P/\partial y = 1$. By Green's theorem, we get

$$\int_C P\,dx + Q\,dy = \iint_A (-2)\,dy\,dx = -2\,\mathrm{Area}(A),$$

where $\mathrm{Area}(A)$ is the area of the ellipse, which is known to be
$2\pi$ ($= \pi ab$ when the ellipse is in the form $x^2/a^2 + y^2/b^2 = 1$; here $a = 1$ and $b = 2$).
Hence the integral is equal to $-4\pi$.
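
As a check on this example, the curve integral can also be computed directly from a parametrization. The sketch below assumes the counterclockwise parametrization $t \mapsto (\cos t, 2\sin t)$ of the ellipse; the helper `line_integral` and the number of subdivisions are illustrative choices.

```python
import math


def line_integral(P, Q, curve, d_curve, t0, t1, n=2000):
    """Midpoint-rule approximation of the integral of P dx + Q dy along a curve.

    curve(t) returns (x, y) and d_curve(t) returns (dx/dt, dy/dt).
    """
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * dt
        x, y = curve(t)
        dx, dy = d_curve(t)
        total += (P(x, y) * dx + Q(x, y) * dy) * dt
    return total


def P(x, y):
    return y + 3 * x


def Q(x, y):
    return 2 * y - x


def ellipse(t):
    """Counterclockwise parametrization of 4x^2 + y^2 = 4."""
    return (math.cos(t), 2 * math.sin(t))


def d_ellipse(t):
    return (-math.sin(t), 2 * math.cos(t))


value = line_integral(P, Q, ellipse, d_ellipse, 0.0, 2 * math.pi)
expected = -2 * (math.pi * 1 * 2)  # -2 * Area(A), with Area(A) = pi * a * b
print(value, expected)
```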


Exercises 

1. Use Green's theorem to find the integral $\int_C y^2\,dx + x\,dy$ when $C$ is the
following curve (taken counterclockwise).

(a) The square with vertices (0, 0), (2, 0), (2, 2), (0, 2). 

(b) The square with vertices (±1, ±1). 

(c) The circle of radius 2 centered at the origin. 

(d) The circle of radius 1 centered at the origin. 

(e) The square with vertices (±2, 0), (0, ±2). 

(f) The ellipse x 2 /a 2 + y 2 /b 2 = 1. 

2. Let $A$ be a region, which is the interior of a closed curve $C$ oriented counterclockwise.
Show that the area of $A$ is given by

$$\mathrm{Area}(A) = \frac{1}{2}\int_C -y\,dx + x\,dy = \int_C x\,dy.$$

3. Let $C_1$ be the closed path consisting of the vertical segment on the line
$x = 2$, and the piece of the parabola

$$y^2 = 2(x + 2)$$

lying to the left of this segment, as shown on Fig. 9. We assume that $C_1$
is oriented counterclockwise. Find the integral

$$\int_{C_1} \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy.$$

[Hint: Reduce this to an integral over the circle of radius 1.]

Figure 9


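4. Assume that the function $f$ satisfies Laplace's equation,

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0,$$

on a region $A$ which is the interior of a curve $C$, oriented counterclockwise.
Show that

$$\int_C \frac{\partial f}{\partial y}\,dx - \frac{\partial f}{\partial x}\,dy = 0.$$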


5. If $F = (P, Q)$ is a vector field, we recall that its divergence is defined to be
$\mathrm{div}\,F = \partial P/\partial x + \partial Q/\partial y$. Let

$$C(t) = (x_1(t), x_2(t)), \quad a \leq t \leq b$$

be a closed curve. Define the right normal vector at $t$ to be the vector

$$N(t) = (x_2'(t), -x_1'(t)).$$

Verify that this is a vector perpendicular to the curve. Show that if $F$ is a
vector field on a region $A$, which is the interior of the closed curve $C$,
oriented counterclockwise, then

$$\iint_A (\mathrm{div}\,F)\,dy\,dx = \int_a^b F \cdot N\,dt.$$

Note that $\|N(t)\| = \|C'(t)\| = v(t)$. Since $s(t) = \int v(t)\,dt$, the integral
on the right is often expressed in terms of $s$. Let $n = N/\|N\|$ be the unit
vector in the direction of $N$. Then our formula reads:

$$\iint_A (\mathrm{div}\,F)\,dy\,dx = \int_C F \cdot n\,ds.$$

6. Let $C\colon [a, b] \to U$ be a $C^1$-curve in an open set $U$ of the plane. If $f$ is a
function on $U$ (assumed to be differentiable as needed), we define

$$\int_C f = \int_a^b f(C(t))\,\|C'(t)\|\,dt.$$

For $r > 0$, let $x = r\cos\theta$ and $y = r\sin\theta$. Let $\varphi$ be the function of $r$
defined by

$$\varphi(r) = \frac{1}{2\pi r}\int_{C_r} f = \frac{1}{2\pi}\int_0^{2\pi} f(r\cos\theta, r\sin\theta)\,d\theta,$$

where $C_r$ is the circle of radius $r$, parametrized as above. Assume that
$f$ satisfies Laplace's equation

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0.$$

Show that $\varphi(r)$ does not depend on $r$ and in fact

$$f(0, 0) = \frac{1}{2\pi r}\int_{C_r} f.$$

[Hint: First take $\varphi'(r)$ and differentiate under the integral, with respect to $r$.
Let $D_r$ be the disc of radius $r$ which is the interior of $C_r$. Using Exercise 5,
you will find that

$$\varphi'(r) = \frac{1}{2\pi r}\iint_{D_r} \mathrm{div}\,\mathrm{grad}\,f(x, y)\,dy\,dx = 0.$$

Taking the limit as $r \to 0$, prove the desired assertion.]


§2. Application to the change of variables formula 

When a region A is the interior of a closed path, then we can use 
Green’s theorem to prove the change of variables formula in special cases. 
Indeed, Green’s theorem reduces a double integral to an integral over a 
curve, and change of variables formulas for curves are easier to establish 
than for 2-dimensional areas. Thus we begin by looking at a special case 
of a change of variables formula for curves. 

Let $C\colon [a, b] \to U$ be a $C^1$-curve in an open set $U$ of $\mathbf{R}^2$. Let $G\colon U \to \mathbf{R}^2$
be a $C^2$-map, given by coordinate functions,

$$G(u, v) = (x, y) = (f(u, v), g(u, v)).$$

Then the composite $G \circ C$ is a curve. If $C(t) = (\alpha(t), \beta(t))$, then

$$G \circ C(t) = G(C(t)) = \big(f(C(t)), g(C(t))\big).$$

Example 1. Let $G(u, v) = (u, -v)$ be the reflection along the horizontal
axis. If $C(t) = (\cos t, \sin t)$, then

$$G \circ C(t) = (\cos t, -\sin t).$$

Thus $G \circ C$ again parametrizes the circle, but observe that the orientation
of $G \circ C$ is opposite to that of $C$, i.e. it is clockwise!



Figure 10

The reason for this reversal of orientation is that the Jacobian determinant
of $G$ is negative, namely it is the determinant of

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

A map $G$ is said to preserve orientation if $\Delta_G(u, v) > 0$ for all $(u, v)$
in the domain of definition of $G$. For simplicity, we only consider such
maps $G$.

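Green's theorem leads us to consider the integral

$$\int_{G \circ C} x\,dy.$$

By definition and the chain rule, we have

$$\begin{aligned}
\int_{G \circ C} x\,dy &= \int_a^b f(C(t))\left(\frac{\partial g}{\partial u}\frac{du}{dt} + \frac{\partial g}{\partial v}\frac{dv}{dt}\right) dt \\
&= \int_C f(u, v)\,\frac{\partial g}{\partial u}\,du + f(u, v)\,\frac{\partial g}{\partial v}\,dv.
\end{aligned}$$

This is true for any $C^1$-curve as above. Hence it remains true for any
piecewise $C^1$-path, consisting of a finite number of curves.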

We are now ready to state and prove the change of variables formula 
in the case to which Green’s theorem applies. 

Let $U$ be open in $\mathbf{R}^2$, and let $A$ be a region which is the interior of a
closed path $C$ (piecewise $C^1$ as usual) contained in $U$. Let

$$G\colon U \to \mathbf{R}^2$$

be a $C^2$-map, which is $C^1$-invertible on $U$ and such that $\Delta_G > 0$. Then
$G(A)$ is a region which is the interior of the path $G \circ C$. We then have

$$\iint_{G(A)} dy\,dx = \iint_A \Delta_G(u, v)\,du\,dv.$$

Figure 11

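Proof. Let $G(u, v) = (f(u, v), g(u, v))$ be expressed by its coordinates.
We have, using Green's theorem:

$$\begin{aligned}
\iint_{G(A)} dy\,dx = \int_{G \circ C} x\,dy &= \int_C f\,\frac{\partial g}{\partial u}\,du + f\,\frac{\partial g}{\partial v}\,dv \\
&= \iint_A \left[\frac{\partial}{\partial u}\left(f\,\frac{\partial g}{\partial v}\right) - \frac{\partial}{\partial v}\left(f\,\frac{\partial g}{\partial u}\right)\right] du\,dv \\
&= \iint_A \left[\frac{\partial f}{\partial u}\frac{\partial g}{\partial v} + f\,\frac{\partial^2 g}{\partial u\,\partial v} - \frac{\partial f}{\partial v}\frac{\partial g}{\partial u} - f\,\frac{\partial^2 g}{\partial v\,\partial u}\right] du\,dv \\
&= \iint_A \left(\frac{\partial f}{\partial u}\frac{\partial g}{\partial v} - \frac{\partial g}{\partial u}\frac{\partial f}{\partial v}\right) du\,dv \\
&= \iint_A \Delta_G(u, v)\,du\,dv,
\end{aligned}$$

thus proving what we wanted.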


Exercises 

1. Under the same assumptions as the theorem in this section, assume that
$\varphi = \varphi(x, y)$ is a continuous function on $G(A)$, and that we can write
$\varphi(x, y) = \partial Q/\partial x$ for some continuous function $Q$. Prove the more general
formula

$$\iint_{G(A)} \varphi(x, y)\,dy\,dx = \iint_A \varphi(G(u, v))\,\Delta_G(u, v)\,du\,dv.$$

[Hint: Let $P = 0$ and follow the same pattern of proof as in the text.]

2. Let $(x, y) = G(u, v)$ as in the text. We suppose that $G\colon U \to \mathbf{R}^2$, and that
$F$ is a vector field on $G(U)$. Then $F \circ G$ is a vector field on $U$. Let $C$ be a
$C^1$-curve in $U$. Show that

$$\int_{G \circ C} F = \int_C (F \circ G) \cdot \frac{\partial G}{\partial u}\,du + (F \circ G) \cdot \frac{\partial G}{\partial v}\,dv.$$

[Let $F(x, y) = (P(x, y), Q(x, y))$ and apply the definitions.]




CHAPTER XV 


Surface Integrals 


§1. Parametrization, tangent plane, 
and normal vector 

Let us first recall that a curve can be described by an equation, like 

x 2 + y 2 = 1, 

or it can be given parametrically, as when we set 

x = cos 6, 
y = sin 9, 

with 0 ^ 6 ^ 2tt. A similar situation will occur for surfaces, and we 
consider first the parametric representation. 

Let $R$ be a region in the plane, whose variables are denoted by $(t, u)$. Let

$$X\colon R \to \mathbf{R}^3$$

be a mapping, which can be written in terms of its coordinate functions

$$X(t, u) = (x_1(t, u), x_2(t, u), x_3(t, u)),$$

where $x_1, x_2, x_3$ are functions from $R$ into the real numbers. We say that
such a mapping is $C^1$ if each coordinate function is differentiable, and if
its partial derivatives are continuous. In this case, we may view $X$ as
parametrizing a surface in $\mathbf{R}^3$, as shown on Fig. 1.

Figure 1

If $x, y, z$ are the three coordinates of $\mathbf{R}^3$, then we also write the
parametrization of our surface in the form

$$x = f_1(u, v) \ \text{or}\ x(u, v), \qquad y = f_2(u, v) \ \text{or}\ y(u, v), \qquad z = f_3(u, v) \ \text{or}\ z(u, v).$$

Example. We parametrize the sphere of radius $\rho$ by means of spherical
coordinates, as studied in Chapter XII, namely

$$x = \rho\sin\varphi\cos\theta, \qquad y = \rho\sin\varphi\sin\theta, \qquad z = \rho\cos\varphi.$$

The region $R$ in $\mathbf{R}^2$ is the rectangle described by the inequalities

$$0 \leq \varphi \leq \pi \quad\text{and}\quad 0 \leq \theta < 2\pi.$$

Our mapping “wraps” this rectangle around the sphere. If we evaluate

$$x^2 + y^2 + z^2,$$

and use relations like $\sin^2\theta + \cos^2\theta = 1$, we get the value $\rho^2$. This kind
of technique shows us how to get back the equation in rectangular
coordinates from the parametrization.


Example. A torus (i.e. a doughnut-shaped surface) can be given
parametrically by the functions:

$$x = (a + b\cos\varphi)\cos\theta, \qquad y = (a + b\cos\varphi)\sin\theta, \qquad z = b\sin\varphi.$$

The torus is centered at the origin, and $a > 0$ is the distance from the
origin to the center of a cross section, as shown on Fig. 2. The variables
$\varphi, \theta$ satisfy the inequalities

$$0 \leq \varphi < 2\pi \quad\text{and}\quad 0 \leq \theta < 2\pi.$$

Figure 2

The number $b > 0$ is the radius of a cross section. The angle $\varphi$ determines
the rotation of a point in this cross section, as shown in Fig. 3.

Figure 3

It is clear from this picture that the elevation $z$ of a point is given by $b\sin\varphi$.
If we project the point on the $(x, y)$-plane, then the distance of this projection
from the origin is exactly

$$a + b\cos\varphi.$$

To get the $x$-coordinate of this projection, we have to multiply this
distance by $\cos\theta$, and to get the $y$-coordinate of this projection, we have
to multiply it by $\sin\theta$, as shown on Fig. 4.
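
The description above can be tested numerically. The sketch below assumes $a > b$ and uses the standard cartesian equation of such a torus, $(\sqrt{x^2 + y^2} - a)^2 + z^2 = b^2$, which is not stated in the text but follows from the two facts just explained: the projection has distance $a + b\cos\varphi$ from the origin, and the elevation is $b\sin\varphi$. The function names and the sample values $a = 3$, $b = 1$ are illustrative.

```python
import math


def torus_point(a, b, phi, theta):
    """Point of the torus with parameters (phi, theta), following the parametrization above."""
    x = (a + b * math.cos(phi)) * math.cos(theta)
    y = (a + b * math.cos(phi)) * math.sin(theta)
    z = b * math.sin(phi)
    return (x, y, z)


def implicit_residual(a, b, point):
    """Value of (sqrt(x^2 + y^2) - a)^2 + z^2 - b^2, which is 0 on the torus when a > b."""
    x, y, z = point
    return (math.hypot(x, y) - a) ** 2 + z ** 2 - b ** 2


a, b = 3.0, 1.0
steps = 24
worst = max(
    abs(implicit_residual(a, b, torus_point(a, b,
                                            2 * math.pi * i / steps,
                                            2 * math.pi * j / steps)))
    for i in range(steps)
    for j in range(steps)
)
print(worst)
```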





Let $R$ be a region in $\mathbf{R}^2$, and let

$$X\colon R \to \mathbf{R}^3$$

be the parametrization of a surface. We assume that $X$ is of class $C^1$.
We have already studied the derivative of $X$ at a point, and in Chapter XI,
we defined it as the tangent linear map. If

$$X(t, u) = \begin{pmatrix} x_1(t, u) \\ x_2(t, u) \\ x_3(t, u) \end{pmatrix}$$

is represented by coordinates, then the derivative

$$X'(t, u)\colon \mathbf{R}^2 \to \mathbf{R}^3$$

is a linear map, represented by the Jacobian matrix

$$J_X(t, u) = \begin{pmatrix} \dfrac{\partial x_1}{\partial t} & \dfrac{\partial x_1}{\partial u} \\[2ex] \dfrac{\partial x_2}{\partial t} & \dfrac{\partial x_2}{\partial u} \\[2ex] \dfrac{\partial x_3}{\partial t} & \dfrac{\partial x_3}{\partial u} \end{pmatrix}.$$

For simplicity, we shall express ourselves as if this Jacobian matrix were
actually a linear map. If we apply it to the two unit (vertical) vectors

$$E^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad E^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

then we obtain two vectors $A^1, A^2$ which are nothing but

$$A^1 = J_X(t, u)E^1 = \frac{\partial X}{\partial t} \quad\text{and}\quad A^2 = J_X(t, u)E^2 = \frac{\partial X}{\partial u},$$

viewing $X(t, u)$ as a vertical vector. The picture is as follows.

We shall say that $(t, u)$ is a regular point for $X$ if the two vectors $A^1, A^2$
span a plane in $\mathbf{R}^3$. The translation of this plane to the point $X(t, u)$ is
called the tangent plane of the surface at the given point. This is illustrated
on Fig. 6. It is the plane passing through the point $X(t, u)$, parallel to
the vectors $A^1 = \partial X/\partial t$ and $A^2 = \partial X/\partial u$.


Tangent plane 



Figure 6 


We now assume that you have read the section on the cross product in
Chapter I. Then you realize that if $A, B$ are non-zero vectors in $\mathbf{R}^3$, and
are not parallel, their cross product

$$A \times B = (a_2b_3 - a_3b_2, \ a_3b_1 - a_1b_3, \ a_1b_2 - a_2b_1)$$

is perpendicular to both of them, as illustrated on Fig. 7.

Figure 7

If we want a vector of norm 1 perpendicular to both $A$ and $B$, all we have
to do is divide $A \times B$ by its norm.

In the case of a parametrized surface, we can do this with the two
vectors $A^1$ and $A^2$ as above. Of course, $B \times A = -A \times B$ is also
perpendicular to both $A$ and $B$. We use the notation

$$N = \frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u}$$

whenever the surface is given parametrically by $(t, u) \mapsto X(t, u)$. Then
$N = N(t, u)$ is a vector perpendicular to the surface, as shown on Fig. 8.

Figure 8

If we have chosen the orientation, i.e. the order of $t, u$, such that $N$
points outwards from the surface, and if we denote by $n$ the outward unit
normal vector to the surface, then we have

$$n = \frac{N}{\|N\|} = \frac{\dfrac{\partial X}{\partial t} \times \dfrac{\partial X}{\partial u}}{\left\|\dfrac{\partial X}{\partial t} \times \dfrac{\partial X}{\partial u}\right\|}.$$

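Example. We compute the above quantities in the case of the parametrization
of the sphere given above. We get very easily

$$\frac{\partial X}{\partial\varphi} = \begin{pmatrix} \rho\cos\varphi\cos\theta \\ \rho\cos\varphi\sin\theta \\ -\rho\sin\varphi \end{pmatrix} \quad\text{and}\quad \frac{\partial X}{\partial\theta} = \begin{pmatrix} -\rho\sin\varphi\sin\theta \\ \rho\sin\varphi\cos\theta \\ 0 \end{pmatrix}.$$

Hence

$$N = \frac{\partial X}{\partial\varphi} \times \frac{\partial X}{\partial\theta} = \begin{pmatrix} \rho^2\sin^2\varphi\cos\theta \\ \rho^2\sin^2\varphi\sin\theta \\ \rho^2\sin\varphi\cos\varphi \end{pmatrix} = \rho\sin\varphi\,X(\varphi, \theta).$$

Since $\sin\varphi$ and $\rho$ are $\geq 0$, we see that $N$ has the same direction as the
position vector $X(\varphi, \theta)$, and therefore points outward. Taking the square
root of the sum of the squares of the coordinates, we find

$$\left\|\frac{\partial X}{\partial\varphi} \times \frac{\partial X}{\partial\theta}\right\| = \rho^2\sin\varphi.$$

Hence

$$n = \frac{1}{\rho}\,X(\varphi, \theta).$$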
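
The identities of this example are easy to confirm at a sample point. The sketch below is only illustrative: the helper names and the chosen values of $\rho$, $\varphi$, $\theta$ are assumptions made for the check.

```python
import math


def cross(a, b):
    """Cross product of two vectors in 3-space."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def sphere_point(rho, phi, theta):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


def d_phi(rho, phi, theta):
    """Partial derivative of the parametrization with respect to phi."""
    return (rho * math.cos(phi) * math.cos(theta),
            rho * math.cos(phi) * math.sin(theta),
            -rho * math.sin(phi))


def d_theta(rho, phi, theta):
    """Partial derivative of the parametrization with respect to theta."""
    return (-rho * math.sin(phi) * math.sin(theta),
            rho * math.sin(phi) * math.cos(theta),
            0.0)


rho, phi, theta = 2.0, 0.9, 2.3
X = sphere_point(rho, phi, theta)
N = cross(d_phi(rho, phi, theta), d_theta(rho, phi, theta))
scaled_position = tuple(rho * math.sin(phi) * c for c in X)
unit_normal = tuple(c / norm(N) for c in N)
print(N, scaled_position, unit_normal)
```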

Exercises 

1. Compute the coordinates of the vectors $\partial X/\partial\theta$ and $\partial X/\partial\varphi$, when $X$ is the
mapping parametrizing the torus as in Example 2. Compute the norms of
these vectors.

In each one of the following exercises, where you are given a parametrization
$X(t, u)$, compute the tangent vectors $\partial X/\partial t$, $\partial X/\partial u$, their cross product, and the norm
of this cross product. In each case, get an equation in cartesian coordinates for
the surface parametrized by $X$. Draw the picture of the surface.

2. The cone. Let $\alpha$ be a fixed number, $0 < \alpha < \pi/2$. Let

$$X(\theta, z) = (z\sin\alpha\cos\theta, z\sin\alpha\sin\theta, z\cos\alpha),$$

$0 \leq \theta < 2\pi$ and $0 \leq z \leq h$. Describe how you get a cone of height $h$.

3. Paraboloid. Let $X(t, \theta) = (at\cos\theta, at\sin\theta, t^2)$, with

$$0 \leq \theta < 2\pi \quad\text{and}\quad 0 \leq t \leq h.$$

4. Ellipsoid. Let $a, b, c > 0$. Let

$$X(\varphi, \theta) = (a\sin\varphi\cos\theta, b\sin\varphi\sin\theta, c\cos\varphi).$$

5. Cylinder. Let $a > 0$. Let

$$X(\theta, z) = (a\cos\theta, a\sin\theta, z),$$

with $0 \leq \theta < 2\pi$, and $h_1 \leq z \leq h_2$.

6. Surface of revolution (around the $z$-axis). Let $f$ be a function of one variable
$r$, defined for $r_1 \leq r \leq r_2$. Let $0 \leq \theta < 2\pi$, and let

$$X(r, \theta) = (r\cos\theta, r\sin\theta, f(r)).$$

§2. Surface area

Let A, B be non-zero vectors in R 3 , and assume that they are not parallel. 
Then they span a parallelogram, as shown on Fig. 9, and this parallelogram 
is contained in a plane. 


z 


V 



Figure 9 


If $\theta$ is the angle between $A$ and $B$, then the area of this parallelogram is
precisely equal to

$$\|A\| \, \|B\| \, |\sin\theta|$$

as one sees at once from Fig. 10, and as we already mentioned in Chapter I. 



We observe that $\|A\| \, \|B\| \, |\sin\theta|$ is precisely the norm of $A \times B$. Thus in
3-space, we may say that the area of the parallelogram spanned by $A$ and
$B$ is equal to

$$\|A \times B\|.$$

We apply this to surfaces. At each point, the tangent linear map
$X'(t, u)$ of a parametrizing map $X(t, u)$ transforms the unit square spanned
by $E_1$, $E_2$ into a parallelogram spanned by

$$\frac{\partial X}{\partial t}, \quad \frac{\partial X}{\partial u}.$$

We can view this transformation as the local stretching effect on the area
of the square, and by the preceding remark, the area of this parallelogram
is equal to

$$\left\| \frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u} \right\|.$$

Assume that X is defined on a region R, and that the mapping 


$$(t, u) \mapsto X(t, u)$$

is injective, except for a finite number of smooth curves in $R$. Also
assume that the coordinate functions of $X$ are $C^1$, and that all the points
of $R$ are regular, except for a finite number of smooth curves. It is then
reasonable to define the area of the parametrized surface to be the integral

$$\iint_R \left\| \frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u} \right\| dt \, du.$$
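The area integral of $\|\partial X/\partial t \times \partial X/\partial u\|$ over the parameter region can be approximated directly from a parametrization. The sketch below assumes a rectangular parameter region, uses the midpoint rule, and replaces the partial derivatives by central differences; the names `surface_area` and `sphere`, the grid size and the step `h` are illustrative choices. It is tried on the unit sphere in spherical coordinates, whose area is $4\pi$.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def surface_area(X, t0, t1, u0, u1, n=100, h=1e-5):
    """Midpoint-rule approximation of the integral of ||X_t x X_u||
    over the rectangle [t0, t1] x [u0, u1]."""
    dt, du = (t1 - t0)/n, (u1 - u0)/n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5)*dt
        for j in range(n):
            u = u0 + (j + 0.5)*du
            # central differences stand in for the partial derivatives
            Xt = [(p - q)/(2*h) for p, q in zip(X(t + h, u), X(t - h, u))]
            Xu = [(p - q)/(2*h) for p, q in zip(X(t, u + h), X(t, u - h))]
            c = cross(Xt, Xu)
            total += math.sqrt(c[0]**2 + c[1]**2 + c[2]**2)
    return total*dt*du

def sphere(rho):
    """Spherical-coordinate parametrization (phi, theta) -> X of the sphere of radius rho."""
    return lambda phi, theta: (rho*math.sin(phi)*math.cos(theta),
                               rho*math.sin(phi)*math.sin(theta),
                               rho*math.cos(phi))

print(surface_area(sphere(1.0), 0.0, math.pi, 0.0, 2*math.pi))  # close to 4*pi
print(4*math.pi)
```

A finer grid (larger `n`) brings the two printed numbers closer together, at the cost of more function evaluations.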



Example. Let us compute the area of a sphere, whose parametrization 
was given in §1. We had already computed that 


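$$\left\| \frac{\partial X}{\partial \varphi} \times \frac{\partial X}{\partial \theta} \right\| = \rho^2 \sin\varphi.$$

Hence the area of the sphere is equal to

$$\int_0^{2\pi} \int_0^{\pi} \rho^2 \sin\varphi \, d\varphi \, d\theta.$$

Since $\rho^2$ is constant, we take it out of the integral. It is a trivial matter to
carry out the integration, and we find that the desired area is equal to
$4\pi\rho^2$.

Example. Sometimes a surface is given by the graph of a function

$$z = f(x, y),$$

defined over some region $R$ of the $(x, y)$-plane. In this case, we use $t = x$
and $u = y$ as the parameters, so that

$$X(x, y) = (x, y, f(x, y)).$$

Thus the case when a surface is so defined is a special case of the general
parametrization. In this special case, we find

$$\frac{\partial X}{\partial x} = \left(1, 0, \frac{\partial f}{\partial x}\right) \quad\text{and}\quad \frac{\partial X}{\partial y} = \left(0, 1, \frac{\partial f}{\partial y}\right).$$

Consequently

$$\frac{\partial X}{\partial x} \times \frac{\partial X}{\partial y} = \left(-\frac{\partial f}{\partial x},\ -\frac{\partial f}{\partial y},\ 1\right).$$

The area of the surface is given by the integral

$$\iint_R \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \, dx \, dy.$$

Figure 12

Symbolically we may write in this case

$$d\sigma = \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \, dx \, dy.$$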
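For a surface given as a graph $z = f(x, y)$, the area element is $\sqrt{1 + (\partial f/\partial x)^2 + (\partial f/\partial y)^2}\, dx\, dy$. A minimal sketch of the corresponding computation over a rectangle follows; the function name `graph_area` and the two test surfaces are assumptions chosen because their areas are known in closed form (a tilted plane, and a circular arch whose area is an arc length times a width).

```python
import math

def graph_area(f, x0, x1, y0, y1, n=100, h=1e-5):
    """Midpoint-rule approximation of the area of z = f(x, y) over [x0, x1] x [y0, y1]."""
    dx, dy = (x1 - x0)/n, (y1 - y0)/n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5)*dx
        for j in range(n):
            y = y0 + (j + 0.5)*dy
            fx = (f(x + h, y) - f(x - h, y))/(2*h)   # approximates df/dx
            fy = (f(x, y + h) - f(x, y - h))/(2*h)   # approximates df/dy
            total += math.sqrt(1.0 + fx*fx + fy*fy)
    return total*dx*dy

plane = lambda x, y: 2*x + 2*y             # sqrt(1 + 4 + 4) = 3 everywhere
arch = lambda x, y: math.sqrt(1.0 - x*x)   # piece of the cylinder x^2 + z^2 = 1

print(graph_area(plane, 0.0, 1.0, 0.0, 1.0))    # exact value is 3
print(graph_area(arch, -0.5, 0.5, 0.0, 1.0))    # exact value is pi/3
```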


Example. It may also happen that a surface is defined implicitly by
an equation

$$g(x, y, z) = 0,$$

and that over a certain region $R$ of the $(x, y)$-plane, we can then solve for
$z$ by a function

$$z = f(x, y),$$

satisfying this equation, that is

$$g(x, y, f(x, y)) = 0.$$

Taking the partials with respect to $x$ and $y$, we find the relations:

$$\frac{\partial f}{\partial x} = -\frac{\partial g/\partial x}{\partial g/\partial z}, \qquad \frac{\partial f}{\partial y} = -\frac{\partial g/\partial y}{\partial g/\partial z}.$$

We can now use the formula for the area obtained in the preceding
example, and thus obtain a formula for the area just in terms of the given
$g$, namely:

$$\iint_R \frac{\sqrt{(\partial g/\partial x)^2 + (\partial g/\partial y)^2 + (\partial g/\partial z)^2}}{|\partial g/\partial z|} \, dx \, dy.$$


Example. Take the special case of this formula arising from the equation
of a sphere

$$x^2 + y^2 + z^2 - a^2 = 0,$$

where $a > 0$ is the radius. Then $g(x, y, z)$ is the expression on the left,
and the partials are trivially computed:

$$\frac{\partial g}{\partial x} = 2x, \qquad \frac{\partial g}{\partial y} = 2y, \qquad \frac{\partial g}{\partial z} = 2z.$$

We can solve for $z$ explicitly in terms of $x$, $y$ by letting

$$z = \sqrt{a^2 - x^2 - y^2} = f(x, y),$$

where $(x, y)$ ranges over the points in the disc of radius $a$ in the plane. The
surface is then the upper hemisphere.

Figure 13

We can again compute the area of this hemisphere by the integral

$$\iint_R \frac{a}{z} \, dx \, dy = \iint_R \frac{a}{\sqrt{a^2 - x^2 - y^2}} \, dx \, dy,$$

using the fact that $x^2 + y^2 + z^2 = a^2$. Using polar coordinates, we
know how to evaluate this last integral. We get

$$\text{Area of hemisphere} = a \int_0^{2\pi} \int_0^{a} \frac{1}{\sqrt{a^2 - r^2}} \, r \, dr \, d\theta.$$

Integrating 1 with respect to $\theta$ between 0 and $2\pi$ yields $2\pi$. The integral
with respect to $r$ is reducible to the form

$$\int u^{-1/2} \, du$$

(by the substitution $u = a^2 - r^2$), and is therefore easily found. Thus,
finally, we obtain the value

$$2\pi a^2$$

for the area of the hemisphere. Naturally, this jibes with the answer
found from the parametrization by means of spherical coordinates.
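The polar-coordinate integral for the hemisphere has an integrand that becomes unbounded as $r$ approaches $a$, so a naive numerical rule converges slowly. The sketch below sidesteps this with the substitution $r = a \sin s$ (an assumption made here for numerical convenience, different from the substitution one would use by hand), which turns the inner integral into $\int_0^{\pi/2} a \sin s \, ds$. The function name `hemisphere_area` is illustrative.

```python
import math

def hemisphere_area(a, n=2000):
    """Approximate a * int_0^{2pi} int_0^a r / sqrt(a^2 - r^2) dr dtheta.

    With r = a*sin(s), dr = a*cos(s) ds and sqrt(a^2 - r^2) = a*cos(s),
    so the inner integral becomes int_0^{pi/2} a*sin(s) ds.
    """
    ds = (math.pi/2)/n
    inner = sum(a*math.sin((k + 0.5)*ds) for k in range(n))*ds
    return a*2*math.pi*inner   # the theta-integral contributes the factor 2*pi

print(hemisphere_area(3.0))    # close to 2*pi*a^2
print(2*math.pi*3.0**2)
```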

Remark. Just as in the case of curves, it can be shown that the area of 
a surface is independent of the parametrization selected. This amounts to 
a change of variables in a 2-dimensional integral, but we shall omit the 
proof. 


Exercises 

Compute the following areas. 

1. (a) A cone as shown on the following figure.
(b) The cone of height $h$ obtained by rotating the line $z = 3x$ around the
$z$-axis.

2. The surface $z = x^2 + y^2$ lying above the disc of radius 1 in the $(x, y)$-plane.

3. The surface $2z = 4 - x^2 - y^2$ over the disc of radius $\sqrt{2}$ in the $(x, y)$-plane.

4. $z = xy$ over the disc of radius 1.

5. The surface given parametrically by

$$X(t, \theta) = (t \cos\theta,\ t \sin\theta,\ \theta),$$

with $0 \leq t \leq 1$ and $0 \leq \theta \leq 2\pi$.

6. The surface given parametrically by

$$X(t, u) = (t + u,\ t - u,\ t),$$

with $0 \leq t \leq 1$ and $0 \leq u \leq 2\pi$. [Hint: Use $t = \sinh u = (e^u - e^{-u})/2$.]

7. The part of the sphere $x^2 + y^2 + z^2 = 1$ between the planes $z = 1/\sqrt{2}$
and $z = -1/\sqrt{2}$.

8. The part of the sphere $x^2 + y^2 + z^2 = 1$ inside the cone $x^2 + y^2 = z^2$.

9. The torus, using the parametrization in §1, assuming that the cross section
has radius 1.

§3. Surface integrals 
Let R be a region in the plane, and let 

$$X: R \to R^3$$

be the parametrization of a surface by a smooth mapping $X$. Let $S$ be the
image of $X$, i.e. the surface, and let $\psi$ be a function on $S$. Then when $\psi$ is
sufficiently smooth, we define the integral of $\psi$ over $S$ by the formula

$$\iint_S \psi \, d\sigma = \iint_R \psi(X(t, u)) \left\| \frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u} \right\| dt \, du.$$

When $\psi$ is the constant 1, then our formula expresses simply the area of
the parametrized surface.

Example. Suppose that $\psi$ is the function representing a positive density
of the surface. Then the integral above is interpreted as the mass $m$ of
the surface, corresponding to this density.

Example. Let $\psi$ be a density as above, and $m$ the mass. The integrals

$$\bar{x} = \frac{1}{m} \iint_S x \, \psi(x, y, z) \, d\sigma, \qquad \bar{y} = \frac{1}{m} \iint_S y \, \psi(x, y, z) \, d\sigma, \qquad \bar{z} = \frac{1}{m} \iint_S z \, \psi(x, y, z) \, d\sigma$$

give the coordinates $(\bar{x}, \bar{y}, \bar{z})$ of the center of mass of the surface. For
instance, suppose that we want to find the center of mass of a hemisphere
of radius $a$, having constant density $c$. We use the spherical coordinate
parametrization of §1. The hemisphere is the one lying above the $(x, y)$-plane
as in Fig. 14.

Figure 14

By symmetry, it is easy to see that $\bar{x} = \bar{y} = 0$. We have $z = a \cos\varphi$.
The third coordinate $\bar{z}$ is given by the integral

$$\bar{z} = \frac{c}{m} \int_0^{2\pi} \int_0^{\pi/2} a \cos\varphi \cdot a^2 \sin\varphi \, d\varphi \, d\theta,$$

which is easily evaluated to be $c a^3 \pi / m$. The total mass is equal to the
density times the area, since the density is constant, and we know that the
area of the hemisphere is $2\pi a^2$. Hence we find

$$\bar{z} = a/2.$$
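The value $a/2$ for the height of the center of mass of the hemisphere can be confirmed numerically. This sketch uses the spherical-coordinate area element $a^2 \sin\varphi \, d\varphi \, d\theta$ on $0 \leq \varphi \leq \pi/2$ and the midpoint rule in $\varphi$; since nothing depends on $\theta$, the $\theta$-integral is replaced by the factor $2\pi$. The function name and default grid size are assumptions.

```python
import math

def hemisphere_centroid_z(a, c=1.0, n=400):
    """z-coordinate of the center of mass of the upper hemisphere of radius a
    with constant density c."""
    dphi = (math.pi/2)/n
    moment = 0.0   # integral of z * density over the surface
    mass = 0.0     # integral of the density over the surface
    for k in range(n):
        phi = (k + 0.5)*dphi
        dsigma = a*a*math.sin(phi)*dphi*2*math.pi   # area of a thin band
        moment += c*a*math.cos(phi)*dsigma
        mass += c*dsigma
    return moment/mass

print(hemisphere_centroid_z(2.0))          # close to a/2 = 1.0
print(hemisphere_centroid_z(2.0, c=5.0))   # the constant density cancels
```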

Example. Let $X : R \to R^3$ parametrize a surface, and suppose that the
image of $X$, that is the surface, is contained in some open set $U$ in $R^3$.
Let $F$ be a vector field on $U$, that is a mapping

$$F : U \to R^3.$$

We assume that $F$ is as smooth as needed. We define the integral of the
vector field along the surface in a manner similar to the integral of a vector
field along a curve in the lower dimensional case. Namely, let $n$ be the
outward normal unit vector to the surface, it being assumed that we have
agreed on an orientation of the surface which determines its outside and
inside. Then

$$F \cdot n$$

is the projection of the vector field along the normal to the surface, and we
define the above integral by the formula

$$\iint_S F \cdot n \, d\sigma = \iint_R F \cdot n \left\| \frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u} \right\| dt \, du.$$

By definition, we have

$$n = \frac{\dfrac{\partial X}{\partial t} \times \dfrac{\partial X}{\partial u}}{\left\| \dfrac{\partial X}{\partial t} \times \dfrac{\partial X}{\partial u} \right\|}.$$

Hence our integral for $F$ over the surface can be rewritten

$$\iint_S F \cdot n \, d\sigma = \iint_R F(X(t, u)) \cdot \left( \frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u} \right) dt \, du.$$

An important physical example is given by a fluid flow, subject to a force
field $G$, so that we may interpret $G$ as a vector field,

$$G : U \to R^3.$$

Let $\psi$ be the function representing the density of the fluid, so that $\psi(x, y, z)$
is the density at a given point $(x, y, z)$, and is a number. We call

$$F(x, y, z) = \psi(x, y, z) \, G(x, y, z)$$

the flux of the flow, and visualize it as in Fig. 15.

Figure 15

The amount of fluid passing through the surface per unit time is then given
by the integral of the flux over the surface, namely

$$\iint_S F \cdot n \, d\sigma,$$

where $F$ is the flux.

It is not true that all surfaces can be oriented so that we can define an 
outside and an inside. The well-known Moebius strip gives an example 
when this cannot be done. In all the applications that we deal with, 
however, it is geometrically clear what is meant by the inside and outside. 
It is fairly difficult to give a definition in general, and so we don’t go into 
this. 

Observe that when we give a parametrization 

$$(t, u) \mapsto X(t, u),$$

we could interchange the role of $t$, $u$ as the first and second variable,
respectively. Thus, for instance, if

$$X(t, u) = (t, u, t^2 + u^2),$$

we could let

$$Y(u, t) = (t, u, t^2 + u^2).$$

Then

$$\frac{\partial Y}{\partial u} \times \frac{\partial Y}{\partial t} = -\frac{\partial X}{\partial t} \times \frac{\partial X}{\partial u}.$$

Interchanging the variables amounts to changing the orientation. The two
normal vectors corresponding to these two parametrizations have opposite
direction. In finding the integral of a vector field with respect to a given
parametrization, one must therefore agree on what is the “inside” and
what is the “outside” of the surface, and check that the normal vector
obtained from the cross product of the two partial derivatives points to the
outside.
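The effect of interchanging the two parameters can be seen on the surface $(t, u, t^2 + u^2)$ by computing both cross products at the same point of the surface. The helper names below are illustrative assumptions; the partial derivatives are written out by hand.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def normal_X(t, u):
    """dX/dt x dX/du for X(t, u) = (t, u, t^2 + u^2)."""
    return cross((1.0, 0.0, 2*t), (0.0, 1.0, 2*u))

def normal_Y(u, t):
    """dY/du x dY/dt for Y(u, t) = (t, u, t^2 + u^2), with u as the first variable."""
    return cross((0.0, 1.0, 2*u), (1.0, 0.0, 2*t))

# Same point of the surface, parameters listed in the two possible orders.
print(normal_X(1.0, 2.0))   # (-2.0, -4.0, 1.0)
print(normal_Y(2.0, 1.0))   # (2.0, 4.0, -1.0)
```

The second vector is the negative of the first, which is the change of orientation described above.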

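Example. Compute the integral of the vector field

$$F(x, y, z) = (x, y, 0)$$

over the sphere $x^2 + y^2 + z^2 = a^2$ ($a > 0$). We use the parametrization
of §1. Then

$$N(\varphi, \theta) = \frac{\partial X}{\partial \varphi} \times \frac{\partial X}{\partial \theta} = a \sin\varphi \, X(\varphi, \theta).$$

Thus $N(\varphi, \theta)$ is a positive multiple of the position vector $X$, and hence
points outwards. So we get

$$F(X(\varphi, \theta)) \cdot N(\varphi, \theta) = (a \sin\varphi)\left[(a \sin\varphi \cos\theta)^2 + (a \sin\varphi \sin\theta)^2\right] = a^3 \sin^3\varphi,$$

and

$$\iint_R F \cdot N \, d\varphi \, d\theta = a^3 \int_0^{2\pi} \int_0^{\pi} \sin^3\varphi \, d\varphi \, d\theta = \frac{8\pi a^3}{3}.$$

Figure 16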
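The value $8\pi a^3/3$ can be reproduced by integrating $F \cdot N$ numerically over the parameter rectangle $0 \leq \varphi \leq \pi$, $0 \leq \theta \leq 2\pi$, with $F = (x, y, 0)$ and $N = a \sin\varphi \, X(\varphi, \theta)$. The following midpoint-rule sketch is an illustration only; the function name and grid size are assumptions.

```python
import math

def flux_xy0_through_sphere(a, n=200):
    """Midpoint-rule value of the integral of F . N over the sphere of radius a,
    where F = (x, y, 0) and N = a*sin(phi)*X(phi, theta) is the outward normal."""
    dphi, dtheta = math.pi/n, 2*math.pi/n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5)*dphi
        for j in range(n):
            theta = (j + 0.5)*dtheta
            x = a*math.sin(phi)*math.cos(theta)
            y = a*math.sin(phi)*math.sin(theta)
            z = a*math.cos(phi)
            s = a*math.sin(phi)
            N = (s*x, s*y, s*z)
            total += x*N[0] + y*N[1] + 0.0*N[2]
    return total*dphi*dtheta

print(flux_xy0_through_sphere(1.5))   # close to 8*pi*a^3/3
print(8*math.pi*1.5**3/3)
```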


Example. Let $S$ be the paraboloid defined by the equation

$$z = x^2 + y^2.$$

We can use $x$, $y$ as parameters, and represent $S$ parametrically by

$$X(x, y) = (x, y, x^2 + y^2).$$

Then

$$N(x, y) = (1, 0, 2x) \times (0, 1, 2y) = (-2x, -2y, 1).$$

Thus with the parametrization as given, we see from the picture that $N$
points inside the paraboloid.

Figure 17

For instance, when $x$, $y$ are positive, say equal to 1, then

$$N(1, 1) = (-2, -2, 1),$$

which points inward. Consequently, if we want the integral of a vector
field $F$ with respect to the outward orientation, then we have to take
minus the integral $\iint F \cdot N \, dx \, dy$. To handle a concrete case, let

$$F(x, y, z) = (y, -x, z^2).$$

We want to compute the integral of $F$ over the part of the paraboloid
determined by the inequality

$$0 \leq z \leq 1.$$

We have

$$F(X(x, y)) \cdot N(x, y) = -2xy + 2xy + z^2 = z^2 = (x^2 + y^2)^2.$$

Hence

$$\iint_S F \cdot n \, d\sigma = -\iint_R (x^2 + y^2)^2 \, dx \, dy,$$

where $R$ is the unit disc in the $(x, y)$-plane. Changing to polar coordinates,
it is easy to evaluate the integral on the right, which is equal to

$$\int_0^{2\pi} \int_0^{1} r^4 \, r \, dr \, d\theta = \frac{\pi}{3}.$$

The desired integral is therefore equal to $-\pi/3$. Note that in the present
case, we have

$$n = -\frac{N}{\|N\|}.$$
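The sign bookkeeping in this example is easy to get wrong, so a numerical check is worthwhile. The sketch below integrates $F \cdot N$ over the unit disc in polar coordinates with $F = (y, -x, z^2)$ and the inward-pointing $N = (-2x, -2y, 1)$, then flips the sign to obtain the integral for the outward orientation. The function name and grid size are assumptions.

```python
import math

def paraboloid_outward_flux(n=200):
    """Integral of F . n over z = x^2 + y^2, 0 <= z <= 1, for F = (y, -x, z^2),
    with n the outward unit normal (midpoint rule in polar coordinates)."""
    dr, dtheta = 1.0/n, 2*math.pi/n
    total = 0.0
    for i in range(n):
        r = (i + 0.5)*dr
        for j in range(n):
            theta = (j + 0.5)*dtheta
            x, y = r*math.cos(theta), r*math.sin(theta)
            z = x*x + y*y
            F = (y, -x, z*z)
            N = (-2*x, -2*y, 1.0)   # dX/dx x dX/dy, which points inward
            total += (F[0]*N[0] + F[1]*N[1] + F[2]*N[2])*r   # r from dx dy = r dr dtheta
    return -total*dr*dtheta   # minus sign switches to the outward orientation

print(paraboloid_outward_flux())   # close to -pi/3
print(-math.pi/3)
```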



Exercises 

Integrate the following function over the indicated surface. 

1. The function $x^2 + y^2$ over the same upper hemisphere as in the example in
the text.

2. The function $(x^2 + y^2)z$ over this same hemisphere.

3. The function $(x^2 + y^2)z^2$ over this same hemisphere.

4. The function $z(x^2 + y^2)^2$ over this same hemisphere.

5. The function $z$ over the surface

$$z = 1 - x^2 - y^2, \qquad z \geq 0.$$

(Use polar coordinates and sketch the surface.)

6. The function $x$ over the cone $x^2 + y^2 = z^2$, $0 \leq z \leq a$.

7. The function $x$ over the part of the sphere $x^2 + y^2 + z^2 = a^2$ contained
inside the cone of Exercise 6.

8. The function $x^2$ over the cylinder defined by $x^2 + y^2 = a^2$, and $0 \leq z \leq 1$,
excluding its top and bottom.

9. The same function $x^2$ over the top and bottom of the cylinder.

10. Theorem of Pappus. Let $C : [a, b] \to R^2$ be the parametrization of a smooth
curve, say

$$C(t) = (f(t), z(t)),$$

which we view as lying in the $(x, z)$-plane, as shown on Fig. 18. We assume
that $f(t) \geq 0$. Let $\bar{x}$ be the $x$-coordinate of the center of mass of this curve
in the $(x, z)$-plane. Prove that the area of the surface of revolution of this
curve is equal to

$$2\pi \bar{x} L,$$

where $L$ is the length of the curve.

Figure 18

Hint: Parametrize the surface of revolution by the mapping

$$X(t, \theta) = (f(t) \cos\theta,\ f(t) \sin\theta,\ z(t)).$$

What is $\theta$ in Fig. 18? Recall that $\bar{x}$ is given by

$$\bar{x} = \frac{1}{L} \int_a^b f(t) \, \|C'(t)\| \, dt.$$

How does this apply to get the area of the torus in a simple way?

11. Let $S$ be a sphere of radius $a$, centered at $O$. Let $P$ be a
fixed point, either inside or outside the sphere, but not on $S$. Let

$$r(X) = \|X - P\|.$$

Show that

$$\iint_S \frac{1}{r} \, d\sigma = \begin{cases} 4\pi a & \text{if } P \text{ is inside the sphere,} \\[4pt] \dfrac{4\pi a^2}{\|P\|} & \text{if } P \text{ is outside the sphere.} \end{cases}$$


Find the integrals of the following vector fields over the given surfaces. 

12. $F(x, y, z) = \dfrac{1}{\sqrt{x^2 + y^2}} \, (y, -y, 1)$ over the paraboloid

$$z = 1 - x^2 - y^2, \qquad 0 \leq z \leq 1.$$

(Draw the picture.)

13. The same vector field as in Exercise 12, over the lower hemisphere of a sphere
centered at the origin, of radius 1.

14. The vector field $F(x, y, z) = (y, -x, 1)$ over the surface

$$X(t, \theta) = (t \cos\theta,\ t \sin\theta,\ \theta),$$

$0 \leq t \leq 1$ and $0 \leq \theta \leq 2\pi$.

15. The vector field $F(x, y, z) = (x^2, y^2, z^2)$ over the surface

$$X(t, u) = (t + u,\ t - u,\ t),$$

$0 \leq t \leq 2$ and $1 \leq u \leq 3$.

16. The vector field $F(X) = X$, over the part of the sphere $x^2 + y^2 + z^2 = 1$
between the planes $z = 1/\sqrt{2}$ and $z = -1/\sqrt{2}$.

17. The vector field $F(x, y, z) = (x, 0, 0)$ over the part of the unit sphere inside
the cone $x^2 + y^2 = z^2$.

18. The vector field $F(x, y, z) = (x, y^2, z)$ over the triangle determined by the
plane $x + y + z = 1$, and the coordinate planes.

19. The vector field $F(x, y, z) = (x, y, z^2)$ over the cylinder defined by
$x^2 + y^2 = a^2$, $0 \leq z \leq 1$,

(a) excluding the top and bottom,

(b) including the top and bottom.

20. The vector field $F(x, y, z) = (xy, y^2, y^3)$ over the boundary of the unit cube

$$0 \leq x \leq 1, \qquad 0 \leq y \leq 1, \qquad 0 \leq z \leq 1.$$

21. The vector field $F(x, y, z) = (xz, 0, 1)$ over the upper hemisphere of radius 1.


§4. Curl and divergence of a vector field 
Let $U$ be an open set in $R^3$, and let

$$F : U \to R^3$$

be a vector field. Thus $F$ associates a vector to each point of $U$, and $F$ is
given by three coordinate functions,

$$F(x, y, z) = (f_1(x, y, z),\ f_2(x, y, z),\ f_3(x, y, z)).$$

We assume that $F$ is as differentiable as needed; usually class $C^1$
suffices, i.e. each coordinate function is differentiable and has continuous
partial derivatives.

We define the divergence of $F$ to be the function

$$\operatorname{div} F = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}.$$

Thus the divergence is the sum of the partial derivatives of the coordinate
functions, taken with respect to the corresponding variables.

Example. Let $F(x, y, z) = (\sin xy,\ e^{xz},\ 2x + yz^4)$. Then

$$(\operatorname{div} F)(x, y, z) = y \cos xy + 0 + 4yz^3 = y \cos xy + 4yz^3.$$
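The divergence of the field $(\sin xy, e^{xz}, 2x + yz^4)$ can be compared with a finite-difference approximation at a sample point. This is a sanity check rather than a derivation; the function names, the sample point and the step `h` are assumptions.

```python
import math

def F(x, y, z):
    """The vector field of the example: (sin xy, e^{xz}, 2x + y z^4)."""
    return (math.sin(x*y), math.exp(x*z), 2*x + y*z**4)

def divergence(F, x, y, z, h=1e-5):
    """Central-difference approximation of D1 f1 + D2 f2 + D3 f3."""
    d1 = (F(x + h, y, z)[0] - F(x - h, y, z)[0])/(2*h)
    d2 = (F(x, y + h, z)[1] - F(x, y - h, z)[1])/(2*h)
    d3 = (F(x, y, z + h)[2] - F(x, y, z - h)[2])/(2*h)
    return d1 + d2 + d3

def div_formula(x, y, z):
    """Closed form computed by hand: y cos xy + 4 y z^3."""
    return y*math.cos(x*y) + 4*y*z**3

p = (0.3, -1.2, 0.8)   # arbitrary sample point
print(divergence(F, *p))
print(div_formula(*p))   # should agree to several decimal places
```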

As a matter of notation, one sometimes writes symbolically 

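$$\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) = (D_1, D_2, D_3),$$

where $D_1$, $D_2$, $D_3$ are the partial derivative operators with respect to
the corresponding variables. Then one also writes

$$\operatorname{div} F = \nabla \cdot F = D_1 f_1 + D_2 f_2 + D_3 f_3.$$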


We shall interpret the divergence geometrically later. Similarly, we 
now define the curl of F, and interpret it geometrically later. We define 


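$$\operatorname{curl} F = \left( \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z},\ \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x},\ \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right) = (D_2 f_3 - D_3 f_2,\ D_3 f_1 - D_1 f_3,\ D_1 f_2 - D_2 f_1).$$

The curl of $F$ is therefore also a vector field.
Again, we use the symbolic notation

$$\operatorname{curl} F = \nabla \times F = \begin{vmatrix} E_1 & E_2 & E_3 \\ D_1 & D_2 & D_3 \\ f_1 & f_2 & f_3 \end{vmatrix},$$

where $E_1$, $E_2$, $E_3$ are the standard unit vectors. The “determinant” on the
right is to be interpreted symbolically, using an expansion according to the
first row. For instance, the first term in such an expansion is obtained by
taking $E_1$ and “multiplying” it by the “determinant”

$$\begin{vmatrix} D_2 & D_3 \\ f_2 & f_3 \end{vmatrix} = D_2 f_3 - D_3 f_2.$$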


This means that the first component of curl $F$ is $D_2 f_3 - D_3 f_2$. The
other components are obtained by a similar formal operation on the
“determinant”, with respect to the second and third components of
the first row.

Example. Let $F$ be the same vector field as in the preceding example.
Then

$$\operatorname{curl} F = \begin{vmatrix} E_1 & E_2 & E_3 \\ D_1 & D_2 & D_3 \\ \sin xy & e^{xz} & 2x + yz^4 \end{vmatrix}$$

$$= (z^4 - x e^{xz},\ 0 - 2,\ z e^{xz} - x \cos xy) = (z^4 - x e^{xz},\ -2,\ z e^{xz} - x \cos xy).$$
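The component formula $(D_2 f_3 - D_3 f_2,\ D_3 f_1 - D_1 f_3,\ D_1 f_2 - D_2 f_1)$ translates directly into a finite-difference routine. The sketch below is tried on a field not taken from the text, the rotation field $(-y, x, 0)$, whose curl is $(0, 0, 2)$ at every point; the names `curl` and `rotation` and the step `h` are assumptions.

```python
def curl(F, x, y, z, h=1e-5):
    """Central-difference approximation of
    (D2 f3 - D3 f2, D3 f1 - D1 f3, D1 f2 - D2 f1)."""
    def D(i, j):
        # approximates D_{i+1} f_{j+1}: derivative of component j along variable i
        p, q = [x, y, z], [x, y, z]
        p[i] += h
        q[i] -= h
        return (F(*p)[j] - F(*q)[j])/(2*h)
    return (D(1, 2) - D(2, 1), D(2, 0) - D(0, 2), D(0, 1) - D(1, 0))

def rotation(x, y, z):
    """Rotation about the z-axis: F = (-y, x, 0)."""
    return (-y, x, 0.0)

print(curl(rotation, 0.4, -1.1, 2.0))   # close to (0, 0, 2)
```

The same routine can be pointed at any smooth field to compare against a curl computed by expanding the symbolic determinant.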


Exercises 


Compute the divergence and the curl of the following vector fields. 

1. $F(x, y, z) = (x^2,\ xyz,\ yz^2)$

2. $F(x, y, z) = (y \log x,\ x \log y,\ xy \log z)$

3. $F(x, y, z) = (x^2,\ \sin xy,\ e^{z} yz)$

4. $F(x, y, z) = (e^{xy} \sin z,\ e^{xz} \sin y,\ e^{yz} \cos x)$

5. Let $\varphi$ be a smooth function. Prove that curl grad $\varphi = 0$.

6. Prove that div curl $F = 0$.

7. Let $\nabla^2 = \nabla \cdot \nabla = D_1^2 + D_2^2 + D_3^2 = \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2} + \dfrac{\partial^2}{\partial z^2}$. A function
$f$ is said to be harmonic if $\nabla^2 f = 0$. Prove that the following functions
are harmonic.

(a) $\dfrac{1}{\sqrt{x^2 + y^2 + z^2}}$ (b) $x^2 - y^2 + 2z$

(c) If $f$ is harmonic, prove that div grad $f = 0$.

8. Let $F(X) = c \dfrac{X}{\|X\|^3}$, where $c$ is constant. Prove that div $F = 0$ and that
curl $F = 0$.

9. Prove that div $(F \times G) = G \cdot \operatorname{curl} F - F \cdot \operatorname{curl} G$, if $F$, $G$ are vector fields.

10. Prove that div (grad $f$ $\times$ grad $g$) $= 0$, if $f$, $g$ are functions.


§5. Divergence theorem 

In this section, we let $U$ be a 3-dimensional region in $R^3$, whose boundary
is a closed surface which is smooth, except for a finite number of smooth
curves. For instance, a 3-dimensional rectangular box is such a region.
The inside of a sphere, or of an ellipsoid is such a region. The region
bounded by the plane $z = 2$, and inside the paraboloid $z = x^2 + y^2$ is
such a region, illustrated in Fig. 19.

Figure 19

Note that the boundary consists of two pieces, the surface of the paraboloid
and the disc on top, each of which can be easily parametrized.

Divergence Theorem. Let $U$ be a region in 3-space, forming the inside of
a surface $S$ which is smooth, except for a finite number of smooth curves.
Let $F$ be a $C^1$ vector field on an open set containing $U$ and $S$. Let $n$ be
the unit outward normal vector to $S$. Then

$$\iint_S F \cdot n \, d\sigma = \iiint_U \operatorname{div} F \, dV,$$


where the expression on the right is simply the triple integral of the 
function div F over the region U. 

It is not easy to give a proof of the divergence theorem in general, but 
we shall give it in a special case of a rectangular box. This makes the 
general case very plausible, because we could reduce the general case to 
the special case by the following steps: 

(i) Analyze how surface integrals change (or rather do not change) 
when we change the variables. 

(ii) Reduce the theorem to a “local one” where the region admits one 
parametrization from a rectangular box. This can be done by various 
chopping-up processes, some of which are messy, some of which are neat, 
but all of which take up a fair amount of space to establish fully. 

(iii) Combine the first and second steps, reducing the local theorem 
concerning the region to the theorem concerning a box, by means of the 
change of variables formula.

We now prove the theorem for a box, expressed as a product of intervals:

$$[a_1, a_2] \times [b_1, b_2] \times [c_1, c_2],$$

and illustrated in Fig. 20.



The surface surrounding the box consists of six sides, so that the integral 
over S will be a sum of six integrals, each one taken over one of the sides. 
Let $S_1$ be the front face. We can parametrize $S_1$ by

$$X(y, z) = (a_2, y, z),$$

with $y$, $z$ satisfying the inequalities

$$b_1 \leq y \leq b_2 \quad\text{and}\quad c_1 \leq z \leq c_2.$$

Let $n_1$ be the unit outward normal vector on $S_1$. Then

$$n_1 = (1, 0, 0).$$

If $F = (f_1, f_2, f_3)$, then $F \cdot n_1 = f_1$, and hence

$$\iint_{S_1} F \cdot n \, d\sigma = \int_{c_1}^{c_2} \int_{b_1}^{b_2} f_1(a_2, y, z) \, dy \, dz.$$

Similarly, let $S_2$ be the back face, parametrized by

$$X(y, z) = (a_1, y, z),$$

with $y$, $z$ satisfying the same inequalities as above. Then

$$n_2 = -(1, 0, 0),$$

the geometric interpretation being that the outward unit normal vector
points to the back of the box drawn on Fig. 20. Hence

$$\iint_{S_2} F \cdot n \, d\sigma = \int_{c_1}^{c_2} \int_{b_1}^{b_2} -f_1(a_1, y, z) \, dy \, dz.$$

Adding the integrals over $S_1$ and $S_2$ yields

$$\iint_{S_1} F \cdot n \, d\sigma + \iint_{S_2} F \cdot n \, d\sigma = \int_{c_1}^{c_2} \int_{b_1}^{b_2} \left[ f_1(a_2, y, z) - f_1(a_1, y, z) \right] dy \, dz$$

$$= \int_{c_1}^{c_2} \int_{b_1}^{b_2} \int_{a_1}^{a_2} D_1 f_1(x, y, z) \, dx \, dy \, dz$$

$$= \iiint_U D_1 f_1 \, dV.$$

We now carry out a similar argument for the right side and the left side,
as well as the top side and the bottom side. We find that the sums of the
surface integrals taken over these pairs of sides are equal to

$$\iiint_U D_2 f_2 \, dV$$

and

$$\iiint_U D_3 f_3 \, dV,$$

respectively. Adding all three volume integrals yields

$$\iint_S F \cdot n \, d\sigma = \iiint_U (D_1 f_1 + D_2 f_2 + D_3 f_3) \, dV,$$

which is precisely the integral of the divergence, thus proving what we
wanted.


Example. Let us compute the integral of the vector field 

F(x, y, z) = (x 2 , y 2 , z 2 ) 

over the unit cube by using the divergence theorem. The divergence of F 
is equal to $2x + 2y + 2z$, and hence the integral is equal to

$$\int_0^1 \int_0^1 \int_0^1 (2x + 2y + 2z) \, dx \, dy \, dz,$$

which is easily evaluated to give the value 3.

Example. Let us compute the integral of the vector field

$$F(x, y, z) = (x, y, z),$$

that is $F(X) = X$, over the sphere of radius $a$. The divergence of $F$ is equal to

$$\frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3.$$

The ball $B$ is the inside of the sphere. By the divergence theorem, we get

$$\iint_S F \cdot n \, d\sigma = \iiint_B 3 \, dV = 3 \cdot \tfrac{4}{3} \pi a^3 = 4\pi a^3.$$


B 


Note that the volume integral over the ball B of radius a is the integral of 
the constant 3, and hence is equal to 3 times the volume of the ball. 
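Both examples can be cross-checked by computing the surface integrals directly, without the divergence theorem. The sketch below sums the outward flux over the six faces of the unit cube and, for the sphere, uses the spherical-coordinate relation $F \cdot N = a^3 \sin\varphi$ that holds when $F(X) = X$. Function names and grid sizes are illustrative assumptions.

```python
import math

def cube_flux(F, n=50):
    """Outward flux of F through the boundary of the unit cube [0, 1]^3 (midpoint rule)."""
    h = 1.0/n
    total = 0.0
    for i in range(n):
        s = (i + 0.5)*h
        for j in range(n):
            t = (j + 0.5)*h
            total += F(1.0, s, t)[0] - F(0.0, s, t)[0]   # faces x = 1 and x = 0
            total += F(s, 1.0, t)[1] - F(s, 0.0, t)[1]   # faces y = 1 and y = 0
            total += F(s, t, 1.0)[2] - F(s, t, 0.0)[2]   # faces z = 1 and z = 0
    return total*h*h

def sphere_flux_of_identity(a, n=400):
    """Flux of F(X) = X through the sphere of radius a; here F . N = a^3 sin(phi),
    and the theta-integral contributes the factor 2*pi."""
    dphi = math.pi/n
    return 2*math.pi*sum(a**3*math.sin((k + 0.5)*dphi) for k in range(n))*dphi

print(cube_flux(lambda x, y, z: (x*x, y*y, z*z)))   # close to 3
print(sphere_flux_of_identity(2.0))                  # close to 4*pi*a^3
print(4*math.pi*2.0**3)
```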

The divergence theorem has an interesting application, which can be 
used to interpret the divergence geometrically. 

Corollary. Let $B(t)$ be the solid ball of radius $t > 0$, centered at a point
$P$ in $R^3$. Let $S(t)$ denote the boundary of the ball, i.e. the sphere of radius
$t$, centered at $P$. Let $F$ be a $C^1$ vector field, and let $V(t)$ denote the
volume of $B(t)$. Let $n$ denote the unit normal vector pointing out from the
spheres. Then

$$(\operatorname{div} F)(P) = \lim_{t \to 0} \frac{1}{V(t)} \iint_{S(t)} F \cdot n \, d\sigma.$$

Proof. Let $g = \operatorname{div} F$. Since $g$ is continuous by assumption, we can
write

$$g(X) = g(P) + h(X),$$

where

$$\lim_{X \to P} h(X) = 0.$$

Using the divergence theorem, we get

$$\frac{1}{V(t)} \iint_{S(t)} F \cdot n \, d\sigma = \frac{1}{V(t)} \iiint_{B(t)} \operatorname{div} F \, dV = \frac{1}{V(t)} \iiint_{B(t)} g(P) \, dV + \frac{1}{V(t)} \iiint_{B(t)} h \, dV.$$

Observe that $g(P) = (\operatorname{div} F)(P)$ is constant, and hence can be taken out
of the first integral. The simple integral of $dV$ over $B(t)$ yields the volume
$V(t)$, which cancels, so that the first term is equal to $(\operatorname{div} F)(P)$, which is
the desired answer.

There remains to show that the second term approaches 0 as $t$ approaches
0. But this is clear: The function $h$ approaches 0, and the integral on the
right can be estimated as follows:

$$\left| \frac{1}{V(t)} \iiint_{B(t)} h \, dV \right| \leq \max_{\|X - P\| \leq t} |h(X)| \, \frac{1}{V(t)} \iiint_{B(t)} dV \leq \max_{\|X - P\| \leq t} |h(X)|.$$

As $t \to 0$, the maximum of $|h(X)|$ for $\|X - P\| \leq t$ approaches 0, thus
proving what we wanted.

The integral expression under the limit sign in the corollary can be 
interpreted as the flow going outside the sphere per unit time, in the 
direction of the unit outward normal vector. Dividing by the volume of the 
ball B(t), we obtain the mass per unit volume flowing out of the sphere. 
Thus we get an interpretation for the divergence of F at P as the rate of 
change of mass per unit volume per unit time at P. 
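The limit in the corollary can be watched numerically: divide the outward flux through a small sphere around $P$ by the volume of the ball and let the radius shrink. The sketch below uses a field chosen here for illustration, $F = (x^3, 0, 0)$ with $\operatorname{div} F = 3x^2$, and the point $P = (1, 0, 0)$, so the ratio should tend to 3; for this particular field the ratio equals $3 + 3t^2/5$. The function name and grid size are assumptions.

```python
import math

def flux_over_volume(F, P, t, n=100):
    """(1/V(t)) times the outward flux of F through the sphere of radius t about P."""
    dphi, dtheta = math.pi/n, 2*math.pi/n
    flux = 0.0
    for i in range(n):
        phi = (i + 0.5)*dphi
        for j in range(n):
            theta = (j + 0.5)*dtheta
            # unit outward normal at the point P + t*n of the sphere
            nx = math.sin(phi)*math.cos(theta)
            ny = math.sin(phi)*math.sin(theta)
            nz = math.cos(phi)
            Fx, Fy, Fz = F(P[0] + t*nx, P[1] + t*ny, P[2] + t*nz)
            flux += (Fx*nx + Fy*ny + Fz*nz)*t*t*math.sin(phi)   # d sigma = t^2 sin(phi)
    flux *= dphi*dtheta
    return flux/(4.0/3.0*math.pi*t**3)

F = lambda x, y, z: (x**3, 0.0, 0.0)   # div F = 3x^2, equal to 3 at P
P = (1.0, 0.0, 0.0)
for t in (1.0, 0.1, 0.01):
    print(t, flux_over_volume(F, P, t))   # approaches 3 as t shrinks
```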


Exercises 

1. Compute explicitly the integrals over the top, bottom, right, and left sides 
of the box to check in detail the remaining steps of the proof of the divergence 
theorem, left to the reader in the text, as “similar arguments”. 

2. Let $S$ be the boundary of the unit cube,

$$0 \leq x \leq 1, \qquad 0 \leq y \leq 1, \qquad 0 \leq z \leq 1.$$

Compute the integral of the vector field $F(x, y, z) = (xy, y^2, y^2)$ over the
surface of this cube.

3. Calculate the integral

$$\iint_S (\operatorname{curl} F) \cdot n \, d\sigma$$

where $F$ is the vector field

$$F(x, y, z) = (-y, x^2, z^3),$$

and $S$ is the surface

$$x^2 + y^2 + z^2 = 1, \qquad -1 \leq z \leq 1.$$


Don’t make things more complicated than they need be. 

4. Find the integral of the vector field 

nx) = t 


over the sphere of radius 4.

Find the integral of the following vector fields over the indicated surface.

5. $F(x, y, z) = (yz, xz, xy)$ over the cube centered at the origin and with sides
of length 2.

6. $F(x, y, z) = (x^2, y^2, z^2)$ over the same cube.

7. $F(x, y, z) = (x - y,\ y - z,\ x - y)$ over the same cube.

8. $F(X) = X$ over the same cube.

9. $F(x, y, z) = (x + y,\ y + z,\ x + z)$ over the surface bounded by the
paraboloid

$$z = 4 - x^2 - y^2,$$

and the disc of radius 2 centered at the origin in the $(x, y)$-plane.

10. $F(x, y, z) = (2x, 3y, z)$ over the surface bounding the region enclosed by
the cylinder

$$x^2 + y^2 = 4$$

and the planes $z = 1$ and $z = 3$.

11. $F(x, y, z) = (x, y, z)$, over the surface bounding the region enclosed by the
paraboloid $z = x^2 + y^2$, the cylinder $x^2 + y^2 = 9$, and the plane $z = 0$.

12. $F(x, y, z) = (x + y,\ y + z,\ x + z)$ over the surface bounding the region
defined by the inequalities

$$0 \leq x^2 + y^2 \leq 9 \quad\text{and}\quad 0 \leq z \leq 5.$$

13. $F(x, y, z) = (3x^2, xy, z)$ over the tetrahedron bounded by the coordinate
planes and the plane $x + y + z = 1$.

14. Let $f$ be a harmonic function, that is a function satisfying

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0.$$

Let $S$ be a closed smooth surface bounding a region $U$ in 3-space. Let $f$ be
a harmonic function on an open set containing the region and its boundary.
If $n$ is the unit normal vector to the surface pointing outward, let $D_n f$ be
the directional derivative of $f$ in the direction of $n$. Prove that

$$\iint_S D_n f \, d\sigma = 0.$$

[Hint: Let $F = \operatorname{grad} f$.]

15. Assumptions being as in Exercise 14, prove that

$$\iint_S f \, D_n f \, d\sigma = \iiint_U \|\operatorname{grad} f\|^2 \, dV.$$

[Hint: Let $F = f \operatorname{grad} f$.]

§6. Stokes’ theorem

We recall Green’s theorem in the plane. It stated that if $S$ is a plane
region bounded by a closed path $C$, oriented counterclockwise, and $F$ is
a vector field on some open set containing the region, $F = (f_1, f_2)$, then

$$\iint_S (D_1 f_2 - D_2 f_1) \, d\sigma = \int_C F \cdot dC.$$

Of course in the plane with variables $(x, y)$, $d\sigma = dx \, dy$.

We can now ask for a similar theorem in 3-space, when the surface lies 
in 3-space, and the surface is bounded by a curve in 3-space. The analogous 
statement is true, and is called Stokes’ theorem: 


Stokes’ Theorem. Let $S$ be a smooth surface in $R^3$, bounded by a closed
curve $C$. Assume that the surface is orientable, and that the boundary
curve is oriented so that the surface lies to the left of the curve. Let $F$ be
a $C^1$ vector field in an open set containing the surface $S$ and its boundary.
Then

$$\iint_S (\operatorname{curl} F) \cdot n \, d\sigma = \int_C F \cdot dC.$$



When the surface consists of a finite number of smooth pieces, and the 
boundary also consists of a finite number of smooth curves, then the 
analogous statement holds, by taking a sum over these pieces. 

We shall not prove Stokes’ theorem. The proof can be reduced to that 
of Green’s theorem in the plane by making an analysis of the way both 
sides of the formula behave under changes of variables, i.e. changes of 
parametrization. Note that Green’s theorem in the plane is a special case, 
because then the unit normal vector is simply (0, 0, 1), and the curl of F 
dotted with the unit normal vector is simply the third component of the 
curl, namely

$$D_1 f_2 - D_2 f_1.$$

Thus Green’s theorem in the plane makes the 3-dimensional analogue
quite plausible.

Stokes’ theorem has an interesting consequence as follows. Suppose
that two surfaces $S_1$ and $S_2$ are bounded by a curve $C$, and lie on opposite
sides of the curve, as on Fig. 22. Then

$$\iint_{S_1} (\operatorname{curl} F) \cdot n \, d\sigma = -\iint_{S_2} (\operatorname{curl} F) \cdot n \, d\sigma$$

because these integrals are equal to the integrals of $F$ over the boundary
curve with opposite orientations. We have also drawn separately the surfaces
$S_1$ and $S_2$ having $C$ as boundary. Observe that taken together,
$S_1$ and $S_2$ bound the inside of a 3-dimensional region.



Similarly, we can consider a ball, bounded by a sphere. The two hemispheres
have a common boundary, namely the circle in the plane as
on Fig. 23. Note that $C$ is oriented so that $S_1$ lies to the left of $C$, but $S_2$
lies to the right of $C$.

Figure 23

By the divergence theorem, we know that if $S$ denotes the union of $S_1$
and $S_2$, then

$$\iint_S (\operatorname{curl} F) \cdot n \, d\sigma = \iiint_U \operatorname{div} \operatorname{curl} F \, dV.$$

However, div curl $F = 0$. Hence the integral above is equal to 0. This
corresponds to the fact that

$$\iint_{S_1} (\operatorname{curl} F) \cdot n \, d\sigma = -\iint_{S_2} (\operatorname{curl} F) \cdot n \, d\sigma$$

because the integral over $S_1$ is equal to the integral of $F$ over $C$, whereas
the integral over $S_2$ is equal to the integral of $F$ over $C^-$, which is the same
as $C$ but oriented in the opposite direction.

Example. We shall verify Stokes’ theorem for the vector field

$$F(x, y, z) = (z - y,\ x + z,\ -(x + y)),$$

and the surface bounded by the paraboloid

$$z = 4 - x^2 - y^2$$

and the plane $z = 0$, as on Fig. 24.

Figure 24

First we compute the integral over the boundary curve, which is just
the circle

$$x^2 + y^2 = 4.$$

We parametrize the circle by $x = 2 \cos\theta$ and $y = 2 \sin\theta$ as usual. Then

$$F \cdot dC = (z - y) \, dx + (x + z) \, dy - (x + y) \, dz$$

$$= -2 \sin\theta \, (-2 \sin\theta \, d\theta) + 2 \cos\theta \, (2 \cos\theta \, d\theta) = 4 \, d\theta.$$

Consequently,

$$\int_C F \cdot dC = \int_0^{2\pi} 4 \, d\theta = 8\pi.$$

Now we evaluate the surface integral. First we get the curl, namely

$$\operatorname{curl} F = \begin{vmatrix} E_1 & E_2 & E_3 \\ D_1 & D_2 & D_3 \\ z - y & x + z & -x - y \end{vmatrix} = (-2, 2, 2).$$

We can compute the normal vector as in §1, or by observing that the
surface is defined by the equation

$$f(x, y, z) = z - 4 + x^2 + y^2 = 0,$$

and then finding

$$\operatorname{grad} f(x, y, z) = (2x, 2y, 1),$$

so that

$$n = \frac{1}{\sqrt{4x^2 + 4y^2 + 1}} \, (2x, 2y, 1).$$

Then

$$\iint_S \operatorname{curl} F \cdot n \, d\sigma = \iint_D (-4x + 4y + 2) \, dx \, dy,$$

where $D$ is the disc defined by $x^2 + y^2 \leq 4$. This last integral is easily
found to be equal to $8\pi$, which is, of course, the same value as the integral
of $F$ over the curve in the first part of the example.
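Both sides of Stokes’ theorem in this example can be evaluated numerically. The sketch below computes the line integral of $F = (z - y, x + z, -(x + y))$ around the circle of radius 2 in the plane $z = 0$, and the integral of $(-2, 2, 2) \cdot (2x, 2y, 1)$ over the disc of radius 2 in polar coordinates. The function names and grid sizes are assumptions.

```python
import math

def line_integral(n=1000):
    """Integral of F . dC around x = 2cos(theta), y = 2sin(theta), z = 0."""
    dtheta = 2*math.pi/n
    total = 0.0
    for k in range(n):
        th = (k + 0.5)*dtheta
        x, y, z = 2*math.cos(th), 2*math.sin(th), 0.0
        dx, dy, dz = -2*math.sin(th), 2*math.cos(th), 0.0   # derivative of C in theta
        F = (z - y, x + z, -(x + y))
        total += F[0]*dx + F[1]*dy + F[2]*dz
    return total*dtheta

def surface_integral(n=200):
    """Integral of (curl F) . N over the disc x^2 + y^2 <= 4, with
    curl F = (-2, 2, 2) and N = (2x, 2y, 1) for the graph z = 4 - x^2 - y^2."""
    dr, dtheta = 2.0/n, 2*math.pi/n
    total = 0.0
    for i in range(n):
        r = (i + 0.5)*dr
        for j in range(n):
            th = (j + 0.5)*dtheta
            x, y = r*math.cos(th), r*math.sin(th)
            total += (-2*(2*x) + 2*(2*y) + 2*1.0)*r   # r from dx dy = r dr dtheta
    return total*dr*dtheta

print(line_integral())      # close to 8*pi
print(surface_integral())   # close to 8*pi
print(8*math.pi)
```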

Remark. Green’s and Stokes’ theorems are special cases of higher 
dimensional theorems expressing a relation between an integral over a 
region in space, and another integral over the boundary of the region. To 
give a systematic treatment requires somewhat more elaborate foundations, 
and lies beyond the bounds of this course. 


Exercises 

Verify Stokes’ theorem in each one of the following cases. 

1. $F(x, y, z) = (z, x, y)$, $S$ defined by $z = 4 - x^2 - y^2$, $z \geq 0$.

2. $F(x, y, z) = (x^2 + y,\ yz,\ x - z^2)$ and $S$ is the triangle defined by the plane
$2x + y + 2z = 2$ and $x, y, z \geq 0$.

3. $F(x, y, z) = (x, z, -y)$ and the surface is the portion of the sphere of radius
2 centered at the origin, such that $y \geq 0$.

4. $F(x, y, z) = (x, y, 0)$ and the surface is the part of the paraboloid $z = x^2 + y^2$
inside the cylinder $x^2 + y^2 = 4$.

5. F(x, y, z) = (y x + z, z 2 ), and the surface is that part of the cone 
$z^2 = x^2 + y^2$ between the planes $z = 0$ and $z = 1$.

Compute the integral $\iint_S \operatorname{curl} F \cdot n \, d\sigma$ by means of Stokes’ theorem.

6. $F(x, y, z) = (y, z, x)$ over the triangle with vertices at the unit points

$$(1, 0, 0), \quad (0, 1, 0), \quad (0, 0, 1).$$

7. $F(x, y, z) = (x + y,\ y - z,\ x + y + z)$ over the hemisphere

$$x^2 + y^2 + z^2 = a^2, \qquad z \geq 0.$$



APPENDIX 


Fourier Series 


In this appendix, we discuss a little more systematically the scalar 
product in the context of spaces of functions. This may be covered at the 
same time that Chapter I is discussed, but I place the material as an 
appendix in order not to interrupt the discussion of ordinary vectors after 
Chapter I. 


§1. General scalar products 

Let $V$ be the set (also called the space) of continuous functions on some
interval, say the interval $[-\pi, \pi]$, which is of interest in Fourier
series. We define the scalar product of functions $f$, $g$ in $V$ to be the
number

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx. \]

This scalar product satisfies conditions analogous to those of Chapter I, 
namely: 

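SP 1. We have $\langle v, w \rangle = \langle w, v \rangle$ for all $v$, $w$ in $V$.

SP 2. If $u$, $v$, $w$ are elements of $V$, then

\[ \langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle. \]

SP 3. If $x$ is a number, then

\[ \langle xu, v \rangle = x\langle u, v \rangle = \langle u, xv \rangle. \]

SP 4. For all $v$ in $V$ we have $\langle v, v \rangle \ge 0$, and $\langle v, v \rangle > 0$ if $v \ne 0$.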

The verification of these properties amounts to recalling simple properties 
of the integral. For instance, for SP 1, we have 

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx = \int_{-\pi}^{\pi} g(x)f(x)\,dx = \langle g, f \rangle. \]

We leave the verification of SP 2 and SP 3 as exercises. To prove SP 4,
suppose that $f$ is a non-zero function. This means that there exists some
point $c$ in the interval $[-\pi, \pi]$ such that $f(c) \ne 0$. Then

\[ \langle f, f \rangle = \int_{-\pi}^{\pi} f(x)^2\,dx, \]

and $f(x)^2$ is a function which is always $\ge 0$, and such that

\[ f(c)^2 > 0. \]

Thus the graph of $f(x)^2$ may look like this.



Figure 1 


Let $p(x) = f(x)^2$. Geometrically, the integral of $p(x)$ from $-\pi$ to $\pi$
is the area under the curve $y = p(x)$ between $-\pi$ and $\pi$, and this area
cannot be 0 since $p(c) > 0$, so the area is $> 0$. We can give a more formal
argument by observing that by continuity, there is an interval of radius $r$
around $c$ and a number $s > 0$ such that

\[ p(x) \ge s \]

for all $x$ in this interval. Then by the definition of the integral according
to lower sums,

\[ \int_{-\pi}^{\pi} p(x)\,dx \ge rs > 0. \]



Figure 2 


All the discussion of Chapter I which was carried out using only the
four properties SP 1 through SP 4 is now seen to be valid in the present
context. For instance, we define elements $v$, $w$ in $V$ to be orthogonal, or
perpendicular, and write $v \perp w$, if and only if $\langle v, w \rangle = 0$.
We define the norm of $v$ to be

\[ \|v\| = \sqrt{\langle v, v \rangle}. \]

Remark. In analogy with ordinary Euclidean space, elements of V 
are also sometimes called vectors. More generally, one can define the 
general notion of a vector space, which is simply a set whose elements can 
be added and multiplied by numbers in such a way as to satisfy the basic 
properties of addition and multiplication (e.g. associativity and
commutativity). Continuous functions on an interval form such a space. In
an arbitrary vector space, one can then define the notion of a scalar
product satisfying the above four conditions. For our purpose, which is
to concentrate on the calculus part of the subject, we work right away in
this function space. However, you should observe throughout that all 
the arguments of this section use only the basic axioms. Of course, when 
we want to find the norm of a specific function, like sin 3x, then we use 
specifically the fact that we are working with the scalar product defined 
by the integral. 
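To make the scalar product concrete, here is a minimal numerical sketch. It is an illustration only: the helper names `inner` and `norm` and the use of the midpoint rule with 20000 steps are assumptions of this example, not constructions from the text.

```python
import math

def inner(f, g, a=-math.pi, b=math.pi, n=20000):
    """Midpoint-rule approximation of the integral of f(x)*g(x) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h

def norm(f):
    """Norm attached to the scalar product: sqrt(<f, f>)."""
    return math.sqrt(inner(f, f))

def sin3(x):
    return math.sin(3 * x)

print(norm(sin3))             # close to sqrt(pi)
print(inner(sin3, math.cos))  # close to 0: sin 3x and cos x are orthogonal
```

Only the function `inner` uses the integral; `norm` is written purely in terms of the scalar product, in the spirit of the remark above.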

We shall now summarize a few properties of the norm. 

If $c$ is any number, then we immediately get

\[ \|cv\| = |c|\,\|v\|, \]

because

\[ \|cv\| = \sqrt{\langle cv, cv \rangle} = \sqrt{c^2 \langle v, v \rangle} = |c|\,\|v\|. \]

Thus we see that the same type of argument as in Chapter I applies here. In
fact, any argument given in Chapter I which does not use coordinates
applies to our more general situation. We shall see further examples as
we go along.

As before, we say that an element $v \in V$ is a unit vector if $\|v\| = 1$.
If $v \in V$ and $v \ne 0$, then $v/\|v\|$ is a unit vector.

The following two identities follow directly from the definition of the
length.

The Pythagoras theorem. If $v$, $w$ are perpendicular, then

\[ \|v + w\|^2 = \|v\|^2 + \|w\|^2. \]

The parallelogram law. For any $v$, $w$ we have

\[ \|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2. \]

The proofs are trivial. We give the first, and leave the second as an
exercise. For the first, we have

\[ \|v + w\|^2 = \langle v + w, v + w \rangle = \langle v, v \rangle + 2\langle v, w \rangle + \langle w, w \rangle = \|v\|^2 + \|w\|^2, \]

because $\langle v, w \rangle = 0$.

Let $w$ be an element of $V$ such that $\|w\| \ne 0$. For any $v$ there exists
a unique number $c$ such that $v - cw$ is perpendicular to $w$. Indeed, for
$v - cw$ to be perpendicular to $w$ we must have

\[ \langle v - cw, w \rangle = 0, \]

whence $\langle v, w \rangle - \langle cw, w \rangle = 0$ and
$\langle v, w \rangle = c\langle w, w \rangle$. Thus

\[ c = \frac{\langle v, w \rangle}{\langle w, w \rangle}. \]

Conversely, letting $c$ have this value shows that $v - cw$ is perpendicular
to $w$. We call $c$ the component of $v$ along $w$. This component is also
called the Fourier coefficient of $v$ with respect to $w$, to fit the
applications in the theory of Fourier series.

In particular, if $w$ is a unit vector, then the component of $v$ along $w$ is
simply

\[ c = \langle v, w \rangle. \]


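Example. Let $V$ be the space of continuous functions on $[-\pi, \pi]$.
Let $f$ be the function given by $f(x) = \sin kx$, where $k$ is some integer $> 0$.
Then

\[ \|f\| = \sqrt{\langle f, f \rangle} = \left( \int_{-\pi}^{\pi} \sin^2 kx \, dx \right)^{1/2} = \sqrt{\pi}. \]

If $g$ is any continuous function on $[-\pi, \pi]$, then the Fourier
coefficient of $g$ with respect to $f$ is

\[ \frac{\langle g, f \rangle}{\langle f, f \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} g(x) \sin kx \, dx. \]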
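As a hedged numerical illustration of the component, the sketch below takes $g(x) = x$ and $f(x) = \sin 2x$; both choices, and the helper names `inner` and `component`, are assumptions made for this example only. An integration by parts gives the exact value $-1$ for this Fourier coefficient.

```python
import math

def inner(f, g, n=20000):
    # Midpoint-rule approximation of the scalar product on [-pi, pi].
    h = 2 * math.pi / n
    return sum(f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h)
               for i in range(n)) * h

def component(v, w):
    """Component (Fourier coefficient) of v along w: <v, w> / <w, w>."""
    return inner(v, w) / inner(w, w)

def g(x):            # illustrative choice of g, not taken from the text
    return x

def f(x):            # f(x) = sin kx with k = 2
    return math.sin(2 * x)

c = component(g, f)

def residual(x):     # the function g - c f
    return g(x) - c * f(x)

print(c)                    # close to -1
print(inner(residual, f))   # close to 0: g - c f is perpendicular to f
```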


Let $c$ be the component of $v$ along $w$. As with the case of $n$-space, we
define the projection of $v$ along $w$ to be the vector $cw$, because of our
usual picture:

Figure 3 (showing the vector $v - cw$)

Exactly the same arguments which we gave in Chapter I can now be
used to get the Schwarz inequality, namely:

Theorem 1. For all $v, w \in V$ we have

\[ |\langle v, w \rangle| \le \|v\|\,\|w\|. \]

Proof. If $w = 0$, then both sides are equal to 0 and our inequality is
obvious. Next, assume that $w = e$ is a unit vector, that is $e \in V$ and
$\|e\| = 1$. If $c$ is the component of $v$ along $e$, then $v - ce$ is
perpendicular to $e$, and also perpendicular to $ce$. Hence by the Pythagoras
theorem, we find

\[ \|v\|^2 = \|v - ce\|^2 + \|ce\|^2 = \|v - ce\|^2 + c^2. \]

Hence $c^2 \le \|v\|^2$, so that $|c| \le \|v\|$. Finally, if $w$ is arbitrary
$\ne 0$, then $e = w/\|w\|$ is a unit vector, so that by what we just saw,

\[ \left| \left\langle v, \frac{w}{\|w\|} \right\rangle \right| \le \|v\|. \]

This yields

\[ |\langle v, w \rangle| \le \|v\|\,\|w\|, \]

as desired.

Theorem 2. If $v, w \in V$, then

\[ \|v + w\| \le \|v\| + \|w\|. \]

Proof. Exactly the same as that of the analogous theorem in Chapter I, §4.

Let $v_1, \ldots, v_n$ be non-zero elements of $V$ which are mutually
perpendicular, that is $\langle v_i, v_j \rangle = 0$ if $i \ne j$. Let $c_i$
be the component of $v$ along $v_i$. Then

\[ v - c_1 v_1 - \cdots - c_n v_n \]

is perpendicular to $v_1, \ldots, v_n$. To see this, all we have to do is to
take the product with $v_j$ for any $j$. All the terms involving
$\langle v_i, v_j \rangle$ will give 0 if $i \ne j$, and we shall have two
remaining terms

\[ \langle v, v_j \rangle - c_j \langle v_j, v_j \rangle \]

which cancel. Thus subtracting linear combinations as above orthogonalizes
$v$ with respect to $v_1, \ldots, v_n$. The next theorem shows that
$c_1 v_1 + \cdots + c_n v_n$ gives the closest approximation to $v$ as a linear
combination of $v_1, \ldots, v_n$.

Theorem 3. Let $v_1, \ldots, v_n$ be vectors which are mutually perpendicular,
and such that $\|v_i\| \ne 0$ for all $i$. Let $v$ be an element of $V$,
and let $c_i$ be the component of $v$ along $v_i$. Let $a_1, \ldots, a_n$ be
numbers. Then

\[ \Bigl\| v - \sum_{k=1}^{n} c_k v_k \Bigr\| \le \Bigl\| v - \sum_{k=1}^{n} a_k v_k \Bigr\|. \]

Proof. We know that

\[ v - \sum_{k=1}^{n} c_k v_k \]

is perpendicular to each $v_i$, $i = 1, \ldots, n$. Hence it is perpendicular
to any linear combination of $v_1, \ldots, v_n$. Now we have:

\[ \Bigl\| v - \sum a_k v_k \Bigr\|^2 = \Bigl\| v - \sum c_k v_k + \sum (c_k - a_k) v_k \Bigr\|^2 = \Bigl\| v - \sum c_k v_k \Bigr\|^2 + \Bigl\| \sum (c_k - a_k) v_k \Bigr\|^2 \]

by the Pythagoras theorem. This proves that

\[ \Bigl\| v - \sum c_k v_k \Bigr\|^2 \le \Bigl\| v - \sum a_k v_k \Bigr\|^2, \]

and thus our theorem is proved.
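Theorem 3 can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes $v(x) = x$ with the perpendicular functions $\sin x$ and $\sin 2x$ on $[-\pi, \pi]$, approximates the scalar product by a midpoint rule, and compares the Fourier coefficients with a few arbitrarily chosen competitors. The names `inner` and `dist_sq` are hypothetical helpers.

```python
import math

def inner(f, g, n=4000):
    # Midpoint-rule approximation of the scalar product on [-pi, pi].
    h = 2 * math.pi / n
    return sum(f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h)
               for i in range(n)) * h

def dist_sq(v, basis, coeffs):
    """Squared norm of v - sum_k coeffs[k] * basis[k]."""
    def r(x):
        return v(x) - sum(a * w(x) for a, w in zip(coeffs, basis))
    return inner(r, r)

v = lambda x: x
basis = [math.sin, lambda x: math.sin(2 * x)]
fourier = [inner(v, w) / inner(w, w) for w in basis]   # the components c_k

best = dist_sq(v, basis, fourier)
others = ([2.1, -1.0], [1.5, -0.5], [0.0, 0.0])        # arbitrary competitors a_k
for a in others:
    print(a, dist_sq(v, basis, a) >= best)             # True for each choice
```

For this choice of $v$ the components come out close to $2$ and $-1$, and every other choice of coefficients gives a larger distance, as the theorem predicts.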

The next theorem is known as the Bessel inequality. 

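Theorem 4. If $v_1, \ldots, v_n$ are mutually perpendicular unit vectors, and
if $c_i$ is the Fourier coefficient of $v$ with respect to $v_i$, then

\[ \sum_{i=1}^{n} c_i^2 \le \|v\|^2. \]

Proof. We have

\[ 0 \le \Bigl\langle v - \sum c_i v_i,\; v - \sum c_i v_i \Bigr\rangle = \langle v, v \rangle - \sum 2c_i \langle v, v_i \rangle + \sum c_i^2 = \langle v, v \rangle - \sum c_i^2. \]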

From this our inequality follows. 
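A numerical illustration of the Bessel inequality, offered as a sketch under assumptions of this example: $v(x) = x$ on $[-\pi, \pi]$, and the unit vectors are $\sin(kx)/\sqrt{\pi}$ for $k = 1, \ldots, 10$. The helper names are hypothetical.

```python
import math

def inner(f, g, n=4000):
    # Midpoint-rule approximation of the scalar product on [-pi, pi].
    h = 2 * math.pi / n
    return sum(f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h)
               for i in range(n)) * h

def unit_sine(k):
    """The unit vector sin(kx) / sqrt(pi) in the function space."""
    return lambda x: math.sin(k * x) / math.sqrt(math.pi)

v = lambda x: x
coeffs = [inner(v, unit_sine(k)) for k in range(1, 11)]
bessel_sum = sum(c * c for c in coeffs)
norm_sq = inner(v, v)
print(bessel_sum, norm_sq)   # the first number does not exceed the second
```

Here $\|v\|^2 = 2\pi^3/3$, and the sum of squares equals $4\pi \sum_{k \le 10} 1/k^2$, which stays below it.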


Exercises 

1. Prove SP 2 and SP 3, using simple properties of the integral. 

2. Let $f_1, \ldots, f_n$ be functions in $V$ which are mutually perpendicular,
that is

\[ \langle f_i, f_j \rangle = 0 \quad \text{if } i \ne j, \]

and assume that none of the functions $f_i$ is 0. Let $c_1, \ldots, c_n$ be
numbers such that

\[ c_1 f_1 + \cdots + c_n f_n = 0 \]

(the zero function). Prove that all $c_i$ are equal to 0.

3. Let $f$ be a fixed element of $V$. Let $W$ be the subset of elements $h$ in
$V$ such that $h$ is perpendicular to $f$. Prove that if $h_1$, $h_2$ lie in
$W$, then $h_1 + h_2$ lies in $W$. If $c$ is a number and $h$ is perpendicular
to $f$, prove that $ch$ is also perpendicular to $f$.

4. Write out the inequalities of Theorem 1 and Theorem 2 explicitly in terms of
the integrals. Appreciate the fact that the notation of the text, following that
of Chapter I, gives a much neater way, and a more geometric way, of
expressing these inequalities.

5. Let $m$, $n$ be positive integers. Prove that the functions

\[ 1, \quad \sin nx, \quad \cos mx \]

are mutually orthogonal. Use formulas like

\[ \sin A \cos B = \tfrac{1}{2}[\sin(A + B) + \sin(A - B)], \]
\[ \cos A \cos B = \tfrac{1}{2}[\cos(A + B) + \cos(A - B)]. \]

6. Let $\varphi_n(x) = \cos nx$ and $\psi_n(x) = \sin nx$, for a positive integer
$n$. Let $\varphi_0$ be the function such that $\varphi_0(x) = 1$, i.e. the
constant function 1. Verify by performing the integrals that

\[ \|\varphi_n\| = \|\psi_n\| = \sqrt{\pi} \quad \text{and} \quad \|\varphi_0\| = \sqrt{2\pi}. \]

7. Let $V$ be the set of continuous functions on the interval [0, 1]. Define the
scalar product in $V$ by the integral

\[ \langle f, g \rangle = \int_0^1 f(x)g(x)\,dx. \]

(a) Prove that this satisfies conditions SP 1 through SP 4. How would you
define $\|f\|$ in the present context?

(b) Let $f(x) = x$ and $g(x) = x^2$. Find $\langle f, g \rangle$.

(c) With $f$, $g$ as in (b), find $\|f\|$ and $\|g\|$.

(d) Let $h(x) = 1$, the constant function 1. Find $\langle f, h \rangle$, $\langle g, h \rangle$, and $\|h\|$.


§2. Computation of Fourier series 


In the previous section we used continuous functions on the interval
$[-\pi, \pi]$. For many applications one has to deal with somewhat more
general functions. A convenient class of functions is that of piecewise
continuous functions. We say that $f$ is piecewise continuous if it is
continuous except at a finite number of points, and if at each such point $c$
the limits

\[ \lim_{h \to 0,\ h > 0} f(c - h) \quad \text{and} \quad \lim_{h \to 0,\ h > 0} f(c + h) \]

both exist. The graph of a piecewise continuous function then looks 
like this: 


Figure 4 

Let $V$ be the set of functions on the interval $[-\pi, \pi]$ which are
piecewise continuous. If $f$, $g$ are in $V$, so is the sum $f + g$.

If $c$ is a number, the function $cf$ is also in $V$, so functions in $V$ can be
added and multiplied by numbers, to yield again functions in $V$. Furthermore,
if $f$, $g$ are piecewise continuous then the ordinary product $fg$ is also
piecewise continuous. We can then form the scalar product $\langle f, g \rangle$
since the integral is defined for piecewise continuous functions, and the three
properties SP 1 through SP 3 are satisfied. However, the scalar product is
not positive definite. A function $f$ which is such that $f(x) = 0$ except at a
finite number of points has norm 0.

Thus it is convenient, instead of SP 4, to formulate a slightly weaker
condition:

Weak SP 4. For all $v$ in $V$ we have $\langle v, v \rangle \ge 0$.

We then call the scalar product positive (not necessarily definite). 

We define the norm of an element as before, and we ask: For which 
elements of V is the norm equal to 0? The answer is simple. 

Theorem 5. Let $V$ be the space of functions which are piecewise continuous
on the interval $[-\pi, \pi]$. Let $f$ be in $V$. Then $\|f\| = 0$ if and only
if $f(x) = 0$ for all but a finite number of points $x$ in the interval.

Proof. First, it is clear that if $f(x) = 0$ except for a finite number of $x$,
then

\[ \|f\|^2 = \int_{-\pi}^{\pi} f(x)^2\,dx = 0. \]

(Draw the picture of $f(x)^2$.) Conversely, suppose $f$ is piecewise continuous
on $[-\pi, \pi]$ and suppose we have a partition of $[-\pi, \pi]$ into intervals
such that $f$ is continuous on each subinterval $[a_i, a_{i+1}]$ except possibly
at the end points $a_i$, $i = 0, \ldots, r - 1$. Suppose that $\|f\| = 0$, so
that also $\|f\|^2 = 0 = \langle f, f \rangle$. This means that

\[ \int_{-\pi}^{\pi} f(x)^2\,dx = 0, \]

and the integral is the sum of the integrals over the smaller intervals, so
that

\[ \sum_{i=0}^{r-1} \int_{a_i}^{a_{i+1}} f(x)^2\,dx = 0. \]

Each integral satisfies

\[ \int_{a_i}^{a_{i+1}} f(x)^2\,dx \ge 0 \]

and hence each such integral is equal to 0. However, since $f$ is continuous
on an interval $[a_i, a_{i+1}]$ except possibly at the end points, we must have
$f(x)^2 = 0$ for $a_i < x < a_{i+1}$, whence $f(x) = 0$ for $a_i < x < a_{i+1}$.
Hence $f(x) = 0$ except at a finite number of points.

The space $V$ of piecewise continuous functions on $[-\pi, \pi]$ is not finite
dimensional. Instead of dealing with a finite number of orthogonal vectors,
we must now deal with an infinite number.

For each positive integer $n$ we consider the functions

\[ \varphi_n(x) = \cos nx, \qquad \psi_n(x) = \sin nx, \]

and we also consider the function

\[ \varphi_0(x) = 1. \]

It is verified by easy direct integrations that

\[ \|\varphi_n\| = \|\psi_n\| = \sqrt{\pi} \quad \text{if } n \ne 0, \qquad \|\varphi_0\| = \sqrt{2\pi}. \]

Hence the Fourier coefficients of a function $f$ with respect to our functions
1, $\cos nx$, $\sin nx$ are equal to:

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx, \]

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \, dx, \]

\[ b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx. \]


Furthermore, the functions 1, $\cos nx$, $\sin mx$ are easily verified to be
mutually orthogonal. In other words, for any pair of distinct functions
$f$, $g$ among 1, $\cos nx$, $\sin mx$ we have $\langle f, g \rangle = 0$. This
means: If $m \ne n$ and $n \ne 0$, then

\[ \int_{-\pi}^{\pi} \cos nx \cos mx \, dx = 0, \qquad \int_{-\pi}^{\pi} \sin nx \sin mx \, dx = 0; \]

and for any $m$, $n$:

\[ \int_{-\pi}^{\pi} \cos nx \sin mx \, dx = 0. \]


The verifications of these orthogonalities are mere exercises in elementary 
calculus, which you should have already done in §1. 
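These orthogonality relations and norms can be spot-checked numerically. The following sketch makes its own assumptions: a midpoint rule for the integral, the functions up to frequency 3 only, and hypothetical helper names `inner` and `trig_system`.

```python
import math
from itertools import combinations

def inner(f, g, n=2000):
    # Midpoint-rule approximation of the scalar product on [-pi, pi].
    h = 2 * math.pi / n
    return sum(f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h)
               for i in range(n)) * h

def trig_system(N):
    """The functions 1, cos x, sin x, ..., cos Nx, sin Nx as Python callables."""
    funcs = [lambda x: 1.0]
    for k in range(1, N + 1):
        funcs.append(lambda x, k=k: math.cos(k * x))
        funcs.append(lambda x, k=k: math.sin(k * x))
    return funcs

funcs = trig_system(3)
worst = max(abs(inner(f, g)) for f, g in combinations(funcs, 2))
norms = [math.sqrt(inner(f, f)) for f in funcs]
print(worst)   # close to 0: all distinct pairs are orthogonal
print(norms)   # sqrt(2*pi) for the constant 1, sqrt(pi) for the others
```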

The Fourier series of a function $f$ (piecewise continuous) is defined to
be the series

\[ a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx). \]

The partial sum

\[ s_n(x) = a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \]

is simply the projection of the function $f$ on the space generated by the
functions 1, $\cos kx$, $\sin kx$ for $k = 1, \ldots, n$. In the present
infinite dimensional case, we write

\[ f \sim a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx). \]

The sense in which one can replace the sign $\sim$ by an equality depends on
various theorems whose proofs go beyond this course. One of these
theorems is the following:

Theorem 6. Assume that the piecewise continuous function $f$ on $[-\pi, \pi]$
is orthogonal to every one of the functions 1, $\cos nx$, $\sin nx$. Then
$f(x) = 0$ except at a finite number of $x$. If $f$ is continuous, then $f = 0$.

Theorem 6 shows at least that a continuous function is entirely determined
by its Fourier series. There is another sense, however, in which
we would like $f$ to be equal to its Fourier series, namely we would like the
values $f(x)$ to be given by

\[ f(x) = a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx) = a_0 + \lim_{n \to \infty} \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx). \]

It is false in general that if $f$ is merely continuous then $f(x)$ is given by
the series. However, it is true under some reasonable conditions, for
instance:

Theorem 7. Let $-\pi < x < \pi$ and assume that $f$ is differentiable in some
open interval containing $x$, and has a continuous derivative in this interval.
Then $f(x)$ is equal to the value of the Fourier series.

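Example 1. Find the Fourier series of the function $f$ such that

\[ f(x) = 0 \quad \text{if } -\pi < x < 0, \]

\[ f(x) = 1 \quad \text{if } 0 < x < \pi. \]

The graph of $f$ is as follows.

Figure 5

Since the Fourier coefficients are determined by an integral, it does not
matter how we define $f$ at $-\pi$, 0, or $\pi$. We have

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{2\pi} \int_0^{\pi} dx = \frac{1}{2}, \]

\[ a_n = \frac{1}{\pi} \int_0^{\pi} \cos nx \, dx = 0, \]

\[ b_n = \frac{1}{\pi} \int_0^{\pi} \sin nx \, dx = \frac{1}{\pi n} (-\cos nx) \Big|_0^{\pi} = \begin{cases} 0 & \text{if } n \text{ is even}, \\ \dfrac{2}{\pi n} & \text{if } n \text{ is odd}. \end{cases} \]

Hence the Fourier series of $f$ is:

\[ f(x) = \frac{1}{2} + \sum_{m=0}^{\infty} \frac{2}{(2m + 1)\pi} \sin (2m + 1)x. \]

By Theorem 7, we know that $f(x)$ is actually given by the series except at
the points $-\pi$, 0, and $\pi$.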
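The behaviour of the partial sums of this series can be observed numerically. The sketch below is an illustration only: the function names, the number of terms, and the midpoint-rule check of $b_n$ are choices made for this example.

```python
import math

def partial_sum(x, terms):
    """1/2 + sum over the first `terms` odd n of (2 / (n*pi)) * sin(n x)."""
    total = 0.5
    for m in range(terms):
        n = 2 * m + 1
        total += 2.0 / (n * math.pi) * math.sin(n * x)
    return total

def b_coefficient(n, steps=20000):
    """Midpoint-rule value of (1/pi) * integral of sin(nx) over [0, pi],
    which is b_n for the step function of Example 1."""
    h = math.pi / steps
    return sum(math.sin(n * (i + 0.5) * h) for i in range(steps)) * h / math.pi

print(partial_sum(math.pi / 2, 1000))      # close to f(pi/2) = 1
print(partial_sum(-math.pi / 2, 1000))     # close to f(-pi/2) = 0
print(partial_sum(0.0, 1000))              # 0.5, halfway across the jump at 0
print(b_coefficient(1), b_coefficient(2))  # close to 2/pi and 0
```

At the jump $x = 0$ every sine term vanishes, so each partial sum equals $1/2$ there, whatever value is assigned to $f(0)$.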

Example 2. Find the Fourier series of the function $f$ such that
$f(x) = -1$ if $-\pi < x < 0$ and $f(x) = x$ if $0 < x < \pi$.
The graph of $f$ is as follows.

Figure 6

Again we compute the Fourier coefficients. We evaluate the integral
over each of the intervals $[-\pi, 0]$ and $[0, \pi]$ since the function is
given by different formulas over these intervals. We have

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{0} (-1)\,dx + \frac{1}{2\pi} \int_0^{\pi} x\,dx = -\frac{1}{2} + \frac{\pi}{4}, \]

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{0} (-1) \cos nx \, dx + \frac{1}{\pi} \int_0^{\pi} x \cos nx \, dx = \begin{cases} 0 & \text{if } n \text{ is even}, \\ -\dfrac{2}{\pi n^2} & \text{if } n \text{ is odd}, \end{cases} \]

\[ b_n = \frac{1}{\pi} \int_{-\pi}^{0} (-1) \sin nx \, dx + \frac{1}{\pi} \int_0^{\pi} x \sin nx \, dx = \begin{cases} -\dfrac{1}{n} & \text{if } n \text{ is even}, \\ \dfrac{2}{\pi n} + \dfrac{1}{n} & \text{if } n \text{ is odd}. \end{cases} \]

Thus we obtain:

\[ f(x) = -\frac{1}{2} + \frac{\pi}{4} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx). \]

The equality is valid for $-\pi < x < 0$ and $0 < x < \pi$ by Theorem 7.

Example 3. Find the Fourier series of the function $\sin^2 x$.

We have

\[ \sin^2 x = \frac{1 - \cos 2x}{2} = \frac{1}{2} - \frac{1}{2} \cos 2x. \]

This is already written as a Fourier series, so the expression on the right
is the desired Fourier series.

A function $f$ is said to be periodic of period $2\pi$ if we have

\[ f(x + 2\pi) = f(x) \]

for all $x$. For such a function, we then have by induction
$f(x + 2\pi n) = f(x)$ for all positive integers $n$. Furthermore, letting
$t = x + 2\pi$, we see also that

\[ f(t - 2\pi) = f(t) \]

for all $t$, and hence $f(x - 2\pi n) = f(x)$ for all $x$ and all positive
integers $n$.

Given a piecewise continuous function on the interval $-\pi \le x < \pi$,
we can extend it to a piecewise continuous function which is periodic of
period $2\pi$ over all of R, simply by periodicity.
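Extension by periodicity is easy to express in code. The sketch below is one possible way to do it, not a construction from the text; the name `periodic_extension` is hypothetical, and the reduction relies on Python's `%` operator returning a non-negative remainder for a positive modulus.

```python
import math

def periodic_extension(f):
    """Extend f, given on [-pi, pi), to a function of period 2*pi on all of R."""
    def extended(x):
        # Shift x by a multiple of 2*pi so that it lands in [-pi, pi).
        y = (x + math.pi) % (2 * math.pi) - math.pi
        return f(y)
    return extended

saw = periodic_extension(lambda x: x)
print(saw(0.5), saw(0.5 + 2 * math.pi), saw(0.5 - 4 * math.pi))  # all close to 0.5
print(saw(math.pi))   # the point pi is identified with -pi
```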

Example 4. Let $f(x) = x$ on $-\pi \le x < \pi$. If we extend $f$ by
periodicity, then the graph of the extended function looks like this:

Figure 7

Example 5. Let $f$ be the function on the interval $-\pi \le x < \pi$ given
by:

\[ f(x) = 0 \quad \text{if } -\pi \le x \le 0, \]

\[ f(x) = 1 \quad \text{if } 0 < x < \pi. \]

Then the graph of the function extended by periodicity looks like this:

Figure 8

Example 6. Let $f$ be the function on the interval $-\pi \le x < \pi$ given
by $f(x) = e^x$. Then the graph of the extended function looks like this:

Figure 9

On the other hand, we may also be given a function over the interval
$[0, 2\pi]$ and then extend this function by periodicity.

Example 7. Let $f(x) = x$ on the interval $0 \le x < 2\pi$. The graph of
the function extended by periodicity to all of R looks like this:

Figure 10


This is different from the function in Example 4, since in the present case,
the extended function is never negative. When the function is given on
an interval $[0, 2\pi]$, we compute the Fourier coefficients by taking the
integral from 0 to $2\pi$. In the present case, we therefore have:

\[ a_0 = \frac{1}{2\pi} \int_0^{2\pi} x\,dx = \pi, \]

\[ a_n = \frac{1}{\pi} \int_0^{2\pi} x \cos nx \, dx = 0 \quad \text{for all } n, \]

\[ b_n = \frac{1}{\pi} \int_0^{2\pi} x \sin nx \, dx = -\frac{2}{n}. \]

Hence we have, for $0 < x < 2\pi$:

\[ x = \pi - 2\left( \sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots \right). \]

Exercises 

1. (a) Let $f(x)$ be the function such that $f(x) = 2$ if $0 \le x < \pi$ and
$f(x) = -1$ if $-\pi \le x < 0$. Compute $\|f\|$.

(b) Same question, if $f(x) = x$ for $0 \le x < \pi$ and $f(x) = -1$ for
$-\pi \le x < 0$.

2. If $f$ is periodic of period $2\pi$ and $a$, $b$ are numbers, show that

\[ \int_a^b f(x)\,dx = \int_{a+2\pi}^{b+2\pi} f(x)\,dx = \int_{a-2\pi}^{b-2\pi} f(x)\,dx. \]

[Hint: Change variables, letting $u = x - 2\pi$, $du = dx$.] Also, prove:

\[ \int_{-\pi}^{\pi} f(x + a)\,dx = \int_{-\pi}^{\pi} f(x)\,dx = \int_{-\pi+a}^{\pi+a} f(x)\,dx. \]

[Hint: Split the integral over the bounds $-\pi + a$, $-\pi$, $\pi$, $\pi + a$.]

3. Let $f$ be an even function, that is $f(x) = f(-x)$, for all $x$. Assume that
$f$ is periodic of period $2\pi$. Show that all its Fourier coefficients with
respect to $\sin nx$ are 0. Let $g$ be an odd function (that is
$g(-x) = -g(x)$). Show that all its Fourier coefficients with respect to
$\cos nx$ are 0.

4. Compute the Fourier series of the functions, given on the interval
$-\pi < x < \pi$ by the following $f(x)$:

(a) $x$ (b) $x^2$ (c) $|x|$ (d) $\sin^2 x$

(e) $|\sin x|$ (f) $|\cos x|$ (g) $\sin^3 x$ (h) $\cos^3 x$

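5. Show that the following relations hold:

(a) For $0 < x < 2\pi$ and $a \ne 0$,

\[ \pi e^{ax} = (e^{2a\pi} - 1) \left( \frac{1}{2a} + \sum_{k=1}^{\infty} \frac{a \cos kx - k \sin kx}{k^2 + a^2} \right). \]

(b) For $0 < x < 2\pi$ and $a$ not an integer,

\[ \pi \cos ax = \frac{\sin 2a\pi}{2a} + \sum_{k=1}^{\infty} \frac{a \sin 2a\pi \cos kx + k(\cos 2a\pi - 1) \sin kx}{a^2 - k^2}. \]

(c) Letting $x = \pi$ in part (b), conclude that

\[ \frac{a\pi}{\sin a\pi} = 1 + 2a^2 \sum_{k=1}^{\infty} \frac{(-1)^k}{a^2 - k^2}. \]

(d) For $0 < x < 2\pi$,

\[ \frac{(\pi - x)^2}{4} = \frac{\pi^2}{12} + \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}. \]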

Answers to Exercises 


I am much indebted to Mr. Mitchell Luskin for the answers to the exercises. 


Chapter I, §1

      A + B              A - B               3A                -2B
1.    (1, 0)             (3, -2)             (6, -3)           (2, -2)
2.    (-1, 7)            (-1, -1)            (-3, 9)           (0, -8)
3.    (1, 0, 6)          (3, -2, 4)          (6, -3, 15)       (2, -2, -2)
4.    (-2, 1, -1)        (0, -5, 7)          (-3, -6, 9)       (2, -6, 8)
5.    (3π, 0, 6)         (-π, 6, -8)         (3π, 9, -3)       (-4π, 6, -14)
6.    (15 + π, 1, 3)     (15 - π, -5, 5)     (45, -6, 12)      (-2π, -6, 2)


Chapter I, §2

1. No 2. Yes 3. No 4. Yes 5. No 6 . Yes 7. Yes 8 . Yes 


Chapter I, §3

1. (a) 5 (b) 10 (c) 30 (d) 14 (e) $\pi^2 + 10$ (f) 245

2. (a) -3 (b) 12 (c) 2 (d) -17 (e) $2\pi^2 - 16$ (f) $15\pi - 10$
4. (b) and (d) 7. <//) = f , (gg) = f, (fg) = 0 


Chapter I, §4

1. (a) $\sqrt{5}$ (b) $\sqrt{10}$ (c) $\sqrt{30}$

(d) $\sqrt{14}$ (e) $\sqrt{10 + \pi^2}$ (f) $\sqrt{245}$

2. (a) $\sqrt{2}$ (b) 4 (c) $\sqrt{3}$

(d) $\sqrt{26}$ (e) $\sqrt{58 + 4\pi^2}$ (f) $\sqrt{10 + \pi^2}$

3. (a) (f, -|) (b) ( 0 , 3) 


(c) §) 


(d) ( 26 , ~ 26 , ft) ( e ) 


7T 2 — 8 


2tt 2 + 29 


(27r, -3, 7) (f) 1 #—(tt, 3, - 1) 


4. (a) (-f, f) 


(b) (-!,¥-) 


10 + TT 2 

(c) (1%, -A, t) 


(d) -H(-l, -2,3) (e) -—— 7 — (tt, 3,-1) (f) ^—^(15,-2,4) 


5. (a) 


14' 

35 


IT 2 + 10 

0 (b) ! 


49 

25 


V41-35 V 4 T • 6 V\7 • 26 V 4 \ • 17 V 26 • 41 

13. 0, 0 14. V2 15. > Vtt 16. V2tt 17. 
Chapter I, §5

1. $X = (1, 1, -1) + t(3, 0, -4)$   2. $X = (-1, 5, 2) + t(-4, 9, 1)$

3. $y = x + 8$   4. $4y = 5x - 1$   6. (c) and (d)

7. (a) $x - y + 3z = -1$ (b) $3x + 2y - 4z = 2\pi + 26$
(c) $x - 5z = -33$

8. (a) $2x + y + 2z = 7$ (b) $7x - 8y - 9z = -29$

(c) $y + z = 1$

9. (3, -9, -5), (1, 5, -7) (Others would be constant multiples of these.)

10. (a) $2(t^2 + 5)^{1/2}$   11. $(15t^2 + 26t + 21)^{1/2}$, $\sqrt{146/15}$

12. (-2, 1, 5)   13. (11, 13, -7)

14. (a) $X = (1, 0, -1) + t(-2, 1, 5)$

(b) $X = (-10, -13, 7) + t(11, 13, -7)$


15. (a) -i 




(d) - 


2 

Vl8 


16. (a) (-4, V-, ¥) (b) (ff, fg, ~j%) 17. ( 1 , 3, - 2 ) 18. 2/V3 


20 . (a) -^= (b) 

V 35 V21 

21. (a) (-§, 4, i) (b) (-1, -V-. 0), (-i Af, 1) 

(c) ( 0 , Y, -!) (d) (- 1 , i) 22 . 


Chapter I, §6

1. (-4, -3, 1)   2. (-1, 1, -1)   3. (-9, 6, -1)   4. 0
5. $E_3$, $E_1$, $E_2$ in that order
7. (0, -1, 0) and (0, 0, 0); no

9. (a) 2 VY 3 I (b) V245 (c) V470 (d) V38I 


Chapter II, §1
1. $(e^t, -\sin t, \cos t)$


2 . 


^2 cos 2 1 , 


1 

1 + t 



3. (—sin t, cos t ) 


4. ( — 3 sin 3 1, 3 cos 30 7. J3 

8 ‘ G’ ~f) + f O’ ~1~) ’ ^ -1, + ^ _1, or y = FF, t = 0 

9. ex + y -f- 2z = e 2 -f 3 10. x + t = 1 

11. VU(7) - Q) (X(T) - Q) 


16. (a) ( 0 , 1 , tt/ 8 ) +_/(—4, 0 , 1 ) (b) ( 1 , 2 , 1 ) + r(l, 2 , 2 ) 

(c) (e 3 , e“ 3 , 3V2) + t(3e\ -le~\ 3V2) (d) (1, 1, 1) + /(1, 3, 4) 
19. (2, 0, 4) and (18, 4, 12) or when t — —3 or 1 


27. $Y'(t) = \dfrac{d}{dt}\,[X(t) \times X'(t)] = X'(t) \times X'(t) + X(t) \times X''(t) = X(t) \times X''(t)$

28. $Y'(t) = \dfrac{d}{dt}\,[X(t) \cdot (X'(t) \times X''(t))]$

$= X'(t) \cdot (X'(t) \times X''(t)) + X(t) \cdot \dfrac{d}{dt}\,[X'(t) \times X''(t)]$

$= X(t) \cdot (X'(t) \times X'''(t))$

Chapter II, §2 
1. V2 2. 2Vl3 

3. (a )\VTl (b)|<V4T-l) + t(log^-±^) (c) e | 

4. (a) 8 (b) 4 - 2V2 _ ( 

5. (a) V5 - V2 + log 2 + (b) V26 - VlO + log | ( ] + ^ 

1 + V5 \1 + V26 

6. Log (V2 + 1) 


Chapter II, §3

3. V 


r 2 + c 


4. (a) The norm of —^|(1, 2, 3) + y(0, 1, 3) (b) 2 
(c) The norm of ^(1, — 2, 3) + y(0, 1, —3). 

|/"( 0 | 


6. k(t) = 
8 . (a) 


(1 + (/') 2)3/2 

I —sin t\ 


7. (1 + t 2 f 2 . V2 
* min = — 


t 


(1 + cos2 0 3/2 


9. 


(1 + 4r 2 ) 3/2 


10 . 


(b) 1 (c) ft 

(n 2 sin 2 r + b 2 cos 2 r) a 




11. |/7r| 12. S - 14. ^ 


Chapter III, §2

       ∂f/∂x               ∂f/∂y                ∂f/∂z
1.     $y$                 $x$                  1
2.     $2xy^5$             $5x^2y^4$            0
3.     $y\cos(xy)$         $x\cos(xy)$          $-\sin(z)$
4.     $-y\sin(xy)$        $-x\sin(xy)$         0
5.     $yz\cos(xyz)$       $xz\cos(xyz)$        $xy\cos(xyz)$
6.     $yze^{xyz}$         $xze^{xyz}$          $xye^{xyz}$
7.     $2x\sin(yz)$        $x^2z\cos(yz)$       $x^2y\cos(yz)$
8.     $yz$                $xz$                 $xy$
9.     $z + y$             $z + x$              $x + y$
10.    $\cos(y - 3z)$      $-x\sin(y - 3z)$     $3x\sin(y - 3z)$


+ y 

+ a: 



Vl — x 2 y 2 

Vl — x 2 y 2 
11. (1) (2, 1, 1) (2) (64, 80, 0) (6) $(6e^6, 3e^6, 2e^6)$

(8) (6, 3, 2) (9) (5, 4, 3)

12. (4) (0, 0, 0) (5) $(\pi^2 \cos \pi^2, \pi \cos \pi^2, \pi \cos \pi^2)$

(7) $(2 \sin \pi^2, \pi \cos \pi^2, \pi \cos \pi^2)$

13. (-1, -2, 1) 14. = yx"- 1 

15* ( —cos 7 r , — 7 r<? sin 7 r , — we sin x ) 


, 1 

—— = x In x 
dy 


\6 (3 X 5\ 
\ 2 > 25 2 ) 


Chapter III, §3


1. 2,-3 2. a, b 3. a, b, c 

7. lim g(h, k) = — 1 lim T lim g(//, £)1 

0 A->0 La->o J 


lim g(//, A:) = 1 

A—>0 


lim f" lim g(h, A:)l 

0 |_a->o J 


Chapter IV, §1

dz _ df du df dv 
dr dx dr dy dr 

2. (a) ~ = 3x 2 + 3yz, 
dx 


j dz _ df du df dv 
dt dx dt dy dt 

= 3xz — 2yz 


(3x 2 + 3yz) + (3xz - 2yz)(-l) + (3xy — y 2 )2i 
j t = (3x 2 + 3yz)2 + (3xz - 2yz)(-l) + (3xy - y 2 )2f 


^ = y2 + 1 , d l = x '- + 1 

dx (1 — xy) 2 dy (1 — xy) 2 

df _ (x 2 + l)sin(3f — s) 
ds (1 — xy) 2 

df _ 2(y 2 + l)cos2l — 3(x 2 + l)sin(3f — j) 
dt (1 — xy) 2 

__*_ } df = _ y _ 

(x 2 + y 2 + z 2 ) 1 ^ 2 dy (x 2 + y 2 + z 2 ) 1 / 2 


* 2 + 1 


3 df _ _x_ 

dx (x 2 + y 2 + z2)!/2 


9. (a) -X/r 3 (b) 2X (c) -3 X/r 5 (d) 2e~ T *X (e) - X/r 2 


(f) -4 mX/r m+2 


Chapter IV, §2

1.     Plane                          Line
(a)    $6x + 2y + 3z = 49$            $X = (6, 2, 3) + t(12, 4, 6)$
(b)    $x + y + 2z = 2$               $X = (1, 1, 0) + t(1, 1, 2)$
(c)    $13x + 15y + z = -15$          $X = (2, -3, 4) + t(13, 15, 1)$
(d)    $6x - 2y + 15z = 22$           $X = (1, 7, 2) + t(-6, 2, -15)$
(e)    $4x + y + z = 13$              $X = (2, 1, 4) + t(8, 2, 2)$
(f)    $z = 0$                        $X = (1, \pi/2, 0) + t(0, 0, \pi/2 + 1)$

2. (a) (3, 0, 1) (b) X = (log 3, - 3) + t(3, 0, 1)

(c) 3x + z == 3 log 3 — 3 

3. (a) X = (3, 2, -6) + t(2, -3, 0) (b) X = (2, 1, -2) + /(—5, 4, -3) 

(c) * = (3, 2, 2) + 1 ( 2 , 3, 0) 4. V(X(t) - Q) 2 

5. (a) 6x + 8y - z = 25 (b) 16x + 12? - 125z = -75 

(c) TTA* + ^ + 2 = 27T 6 . x—2j^ + z=1 


Chapter IV , §3 
!• (a) § (b) max = VlO 


min 


= -vlo 


2. (a) - (b) f§ (c) 2V/145 

2\/ 5 


9V3 3V3' 


3. Increasing I — 


)• 


decreasing 


( 


4. (a) 




0 


67/4 2 • 67/4 2 • 6 7/ 4 

5. 3* + 5? + 4z = 18 6. 6V6 7. V2 


\ 2 

(b) (1,2,-1,1) 


9V3 3V3 
2 




Chapter IV, §4 


1. log ||*|| 2. 


_1 

2r 2 


3. log r. 


k = 2 


1 


,(2 - &) r fc + 2 


k * 2 


Chapter V, §1

1. No 2. No 3. No 4. No 5. No 6. No 


Chapter V, §2 


1. Di\p(x,y) = e r2/ , D 2 \p(x, y) 



e xy 


e 


y 


y 2 


2. D } \P(x,y) = cos(xy), D 2 \p(x, y) 

3. Di\p(x, y) = (y -j- x) 2 

D 2 ^P(x, ?) = 2 yx — 2y + x 2 — 1 

5. D]\J/(x, ?) = e' J ~ T - 

D 2 \p(x,y) = —e v ~ x + e y ~ x 

„ r, ,, ^ lOg(AfT) 

7. D\\p(x,y) = - 

* 

D 2 \J/(x, y) = x log Cry) - 1 


- cos (xy) 

y 


cos (xy) 

~~y 2 ~~ 


4. D\ip(x, y) = e y+x 

D 2 ^/(x, ?) = e y+x — e v+1 


6 . Dpp(x,y) = x 2 y :i 
D 2 \p(x, y) = y 2 x 2 
8. $D_1\psi(x, y) = \sin(3xy)$


D 2 i(x,y) 


cos 3 xy — cos 3 y 
3^2 


+ 


3x sin 3 xy — 3 cos 3 y 

y 


Chapter V, §3 

1. No 2. No 3. No 4. No 

n-J-2 

5. (a) r (b) log r (c) T if n X -2 

n + 2 

6. 2* 2 y 7. * sin 8. x 3 y 2 

9. .x 2 + y 4 10. (a) e xy (b) sin xy 

11. g(r) 

12. Given the vector field F = (f\, in w-space, defined on a rectangle, 

[a\,b\] X • • • X [ a n ,b n ]. Assume that 


dfi = M 

dx, djr,: 


(or Djfi = Difj) 


for all indices /', j. For n — 3, define ( x, y, z ) to be 

( /i(',y, z)dt+ f f 2 (ai, t, z) dt + f a 2 , t) dt, 

J aj Ja 2 ./ag 

and similarly for n variables. Using the hypothesis and the fact that a 
partial derivative of parameters can be taken in and out of an integral, you 
will find easily that <p is a potential function for F. 

Conversely, given a vector field F = (f i,..., /«) on an open set U, if 
there exists a potential function, and if the partial derivatives of the func¬ 
tions fi exist and are continuous, then the relations 


dfi = dfi 
dxj dx i 


must be satisfied for all i,j, for the same reason as that given in the text for 
two variables. This generalizes Theorem 4. 


13. (a) * 2 + § y 2 + 2z 2 

(b) a + y + z 

(c) xe v+2z 

(d) xy sin z 


(e) xyz + z :{ y 

(f) xe yi 

(g) xz 2 + y 2 

(h) z sin xy 


Chapter V, §4 

1. -- 3 ^ 2. ^ 3. 0 4. 0 5. 54 6. V3C/2 7. § 8. -tt - § 

9. rfc 10. 47r 

11. (a) 3 tt/ 4 (b) 0 (c) 0 (d) 0 

12. -tt/ 2 13. 56 15. | 16. 3 17. (a) 4 (b) 3 18. 8 20. 1 - c~ 2,r 
22. (a) No (b) T (c) 0 
Chapter VI, §1

In each answer the three entries are, in order, ∂²f/∂x², ∂²f/∂y², ∂²f/∂x∂y.

1. $y^2e^{xy}$;  $x^2e^{xy}$;  $yxe^{xy} + e^{xy}$

2. $-y^2\sin xy$;  $-x^2\sin xy$;  $-xy\sin xy + \cos xy$

3. $2y^3$;  $6x^2y$;  $6xy^2 + 3$

4. 0;  2;  2

5. $2e^{x^2+y^2} + 4x^2e^{x^2+y^2}$;  $e^{x^2+y^2}(2 + 4y^2)$;  $4xye^{x^2+y^2}$

6. $2\cos(x^2 + y) - 4x^2\sin(x^2 + y)$;  $-\sin(x^2 + y)$;  $-2x\sin(x^2 + y)$

7. $-(3x^2 + y)^2\cos(x^3 + xy) - 6x\sin(x^3 + xy)$;  $-x^2\cos(x^3 + xy)$;  $-(3x^2 + y)x\cos(x^3 + xy) - \sin(x^3 + xy)$

a 2 / _ 2(1 + (* 2 - 2 xyf) - 2(2* - 2 yf(x 2 - 2 xy) 

8 x % (1 + (* 2 — 2 * y ) 2 ) 2 

d 2 f [1 + (* 2 - 2xyf](-2) + 2 x[2(* 2 - 2xy)(-2x)] 
y 2 [1 + (* 2 - 2*^)2]2 

a 2 / — 2[1 + (x 2 — 2xy) 2 ] — (2* - 2y)(x 2 — 2xy)(-2*)2 


8x dy [1 + (*2 _ 2*_y) 2 ] 2 

9. All three = e T+y 10. All three = —sin (* + t) 

11. 1 12. 2x 13. e xyz (\ + 3 xyz -f- x 2 y 2 z 2 ) 

14. (1 — x 2 y 2 z 2 ) cos xyz — 3xyz sin xyz 

15. sin (* + y + z) 16. —cos (* + y + z) 


17. 


48 xyz 

(.x 2 + y 2 + z 2 ) 4 


18. 6x 2 y 


Chapter VI, §2 

1. 9 D\ + 12Z>iZ> 2 + 4D 2 

2. D\+ d\+ Dl + 2 D x D 2 + 2D 2 D 3 + 2D\D% 

3. d\ - D 2 4. D 2 + 2DiD 2 + D 2 2 

5. D'i + 3 d\d 2 + 3D\D\ + D 2 

6. D\ + Ad\d 2 + 6D 2 D 2 + 4D!Z)I + Da 

7. 2D? - DiD 2 - 3 D% 8. DiD 2 - D 3 D 2 + 5DiD 3 - 5D 2 


■&)' 
“■•(£) 


' (c)‘ + (e) 


+ j ee + (e) u -- ! (e) + ™E 


+ 12 


+ 2 Mff + 4 2 

d* dy 


dy 


13. 8 14. 4 15. 4 16. 1 17. (a) 2880xy (b) 0 (c) 144 (d) 1440 

18. (a) 0 (b) 3 • 7!9! (c) 11 • 7!9! (d) 0 

19. (a) 576 (b) -7 -914! (c) 6 (d) 0 

20. (a) 0 (b) 48 (c) 7 • 611017! (d) 0 


( 9 -) 2 +1 

\drj r dr 86 
Chapter VI, §3

1. xy 2. 1 3. xy 4. x 2 + y 2 

2 2 2 

5. l-\-x + y + — + xy -\-^2 6. 1 — ~ 1.x 

8. y + xy 9. x + xy + 2y 2 10. Yes, 0 11. (a) Yes, 0 (b) Yes, 1 

2 2 3 2 

12. Yes, 0 13. Yes, 0 14. 1 + * + ^ 15. 0 

2 2 6 2 

17. Terms up to degree 2 given in text. Term of degree 3 is + 2y) 3 . 

18. (1) -tt(x — 1) + (y — 7r) 

2 

(2) -1 + y (x - l) 2 + Ax - 1 )<y - t r) 

(3) log 7 + f (x - 2) + f(y - 3) - *(jc - 2) 2 
+ AC* - 2)(y - 3) - s\(y - 3) 2 


(4) 2 vV(jc — 7 r) + 2y/w(y — vV) + (jc — vV ) 2 + (y — vV) 2 

3 

(5) e 3 -f- e\x - 1) + e\y - 2) + y (jc - l) 2 

3 

+ e 3 (x - l)(y - 2) + y (t - 2) 2 


(6) -i + Uy ~ *) 2 

(7) i - K* - tt/2) 2 - Hy - *) 2 

2, /r 2 


( 8 ) 


2 V 2 . eW2 , . ^ 2 v"2 


4- 


(jc — 2) 4- 


4- 


4- 


2 

?V2 

4 

e 2 V2 


(y - ir/2) 

2 \/? 

(x - 2) 2 4- —(* - 2)(y - ?r/4) 


(t - t/4) 


(9) $4 + 2(x - 1) + 5(y - 1) + (x - 1)(y - 1) + 2(y - 1)^2$

20. (a) $X + t(Y - X)$, $0 \le t \le 1$

(b) By the mean value theorem applied to the function

$g(t) = f(X + t(Y - X))$,

we get

$f(Y) - f(X) = (\operatorname{grad} f(Z)) \cdot (Y - X)$

for some Z on the line segment. Now use the Schwarz inequality.


Chapter VI, §4 

1. First observe that for each point X we have

$$f(X) - f(O) = \int_0^1 Df(tX)\,dt,$$

where $D = x_1D_1 + \cdots + x_nD_n$. Assuming that $f(O) = 0$, and repeating the argument, assuming that $\nabla f(O) = 0$, we obtain

$$f(X) = \int_0^1\!\int_0^1 t\,D^2 f(stX)\,ds\,dt.$$

Thus we find

$$f(X) = \sum_{i,j=1}^{n} h_{ij}(X)\,x_i x_j,$$

where

$$h_{ij}(X) = \int_0^1\!\int_0^1 t\,D_iD_j f(stX)\,ds\,dt, \quad \text{if } i \ne j,$$

$$h_{ii}(X) = \int_0^1\!\int_0^1 t\,D_iD_i f(stX)\,ds\,dt, \quad \text{if } i = j.$$

We have $h_{ij} = h_{ji}$ because $D_iD_j = D_jD_i$.

Chapter VII, §1

1 . (2, 1), neither max nor min 

2. $((2n + 1)\pi, 1)$ and $(2n\pi, -1)$, neither max nor min

3. (0, 0, 0), min, value 0 

4. $\pm(\sqrt{2}/2, \sqrt{2}/2)$, neither local max nor min. [Hint: Change variables,
letting u = x + y and v = x - y. Then the critical points are at $\pm(\sqrt{2}, 0)$,
and in the (u, v)-plane, near these points, the function increases in one
direction and decreases in the other.]

5. All points of form (0, t, -t), neither max nor min.

6. All (x, y, z) with $x^2 + y^2 + z^2 = 2n\pi$ are local max, value 1.

All (x, y, z) with $x^2 + y^2 + z^2 = (2n + 1)\pi$ are local min, value -1.

7. All points (x, 0) and (0, y) are mins, value 0.

8. (0, 0), min, value 0 9. (t, t), min, value 0

10. $(0, n\pi)$, neither max nor min 11. (1/2, 0), min, value -1/4
12. (0, 0, 0), max, value 1 13. (0, 0, 0), min, value 1 

Chapter VII, §2 
3. (1) $x^2 + 4xy - y^2$

(2) At $((2n + 1)\pi, 1)$, $-xy$. At $(2n\pi, -1)$, $+xy$.

(3) $x^2 + y^2 + z^2$

(4) - e- 112 + 3^ + ^ at (V2/2, V2/2) 

+ 3 ^ + t) at (-^2/2,-V2/2) 

(5) xy + xz 

(6) At (a, b, c) such that $a^2 + b^2 + c^2 = 2n\pi$, the form is

$-2(a^2x^2 + b^2y^2 + c^2z^2) - 4(abxy + acxz + bcyz)$.

At the point (a, b, c) such that $a^2 + b^2 + c^2 = (2n + 1)\pi$, the form is
$2(a^2x^2 + b^2y^2 + c^2z^2) + 4(abxy + acxz + bcyz)$.

(7) At points (a, 0) we get $a^2y^2$. At points (0, b), we get $b^2x^2$.

(8) $y^2$ (9) 0 (10) (11) $x^2 + 2y^2$

(12) $-x^2 - y^2 - z^2$ (13) $x^2 + y^2 + z^2$

4. (a) Neither (b) Min (c) Max (d) Neither (e) Neither 
(f) Neither (g) Max (h) Neither 

Chapter VII, §3

1. Min = —2 at ( — 1, —1), max = 2 at (1, 1) 2. None 

3. Max \ at (\/2/2, \/2/2) and ( — y/2/2, —\ / 2/2) 

4. Max at (^, ^), no min 

5. Min 0 at (0, 0), max 2/e at (0, ±1), rel. max at (±1, 0) 

6. Max = 1 at (0, 1), min = 1/9 at (0, 3) 

7. (a) Both (b) Neither (c) Neither (d) Min (e) Both (f) Max (g) Min 

8. $t = (2n + 1)\pi$, so (-1, 0, 1) and (-1, 0, -1)

Chapter VII, §4 

1. (a) -1/V2 (b) 9/8 2. 1 + 1/V2 3. At (f, f, J) min = 12 

4. X = ^A+_B + C), min value isJ (A 2 + B 2 + C 2 - AB - AC - BC ) 

5. 45 at ±(V3, V6) 6 . (§) 3/2 at Vf (1, 1, 1) 7. Min 0, max 0 

8. Max at $(\pi/8, -\pi/8)$, value $2\cos^2(\pi/8)$; min at $(5\pi/8, 3\pi/8)$, value
$\cos^2(5\pi/8) + \cos^2(3\pi/8)$

9. (0, 0, ±1) 10. No min, max = \ at (^, %) 11. 1 

12. Max = $\sqrt{3}$ at $(\sqrt{3}/3)(1, 1, 1)$, min = $-\sqrt{3}$ at $-(\sqrt{3}/3)(1, 1, 1)$.

13. Values 3 and -3 at (£, -§, f) and (-^, f, -f) 

14. 2\/3 at (x, x, x) with x = x/4/3 15. Max = 1/27, no min 

16. Max = 121/49, min = 0 17. 25/62 18. $d^2/(a^2 + b^2 + c^2)$



Chapter VIII, §1


i. ^ + « = \ J). 

2a+b ~{-\ t 3- 
-“-(.j:; 3’ 
2 - ^+ b =(° _3 ■ 

4 + 2B - ('2 -3 ’ A 


2. A + B 


-10 4 > 

-2 2 , 


-8 7\ 

-2 4 ’ 


y4 + 2B 


A - B = 


3S "( 3 ^ - 3 ) 

^ + ""i) 

(- 2 =? 3 

— -(-J ?=3 

- (-1 J) - 2 * - (5 '3 ■ 
= (2 "3 ■ B " - (: 2 -4) 


/i -i\ /-1 1 

3. */l = ^2 Oj , ^ 5 1 
4 - >A = (-1 i ) ' - ( 1 -") 7 - SanK 

*• (0 -2) ’ same »• A + ' A ~ (1 2) • 


11. Rows of A: (1, 2, 3), (-1, 0, 2) 
Columns of A: 



B + l B 




Rows of B: (-1, 5, -2), (1, 1, -1) 


Columns of B : 



12. Rows of A: (1, -1), (2, 1) 
Rows of B: (—1, 1), (0, —3) 


Columns of A: 
Columns of B: 



Chapter VIII, §2
1. IA = AI = A 2. 0 

3 - (a > (4 ?) (b > (!!!) (n 

*■"-(5-9- 



If C = xI, where x is a number, then AC = CA = xA.



7. (3, 1, 5), first row 8. Second row, third row, i-th row 9. O 10. O


(0 0 1\ 

11. (a) A 2 = I 0 0 0 ) , 

\0 0 0 / 


A 3 = O matrix. If B = 


'0 111 ^ 
0 0 11 
0 0 0 1 
vO 0 0 0/ 


then 


/0 0 
[00 
l 0 0 
\0 0 



B 3 = 


( 000 
0 0 0 
0 0 0 
0 0 0 



and $B^4 = O$.


(b) A* = (0 1 2 V 

\0 0 1 / 


Z 1 3 

A 3 = (0 1 
\0 0 



/I 4 10\ 
A* = (0 1 4j 
\0 0 1 / 


X2. (a) Q (b) ( 3 ) (c) (») (d) («) 

I 3 . (a) ( 7 ) (b)(1) (c)(1) (d)( 4 6 ) 
14. (a) 




/5\ 
(c) I 4 ) 

W 



15. Second column of A 16. /-th column of A 17. y'-th row of A 

18 ' (J " + ?) • (o “) 2# - (<> ~‘) ”• {ABr ' - B ~ lA ~ 


22 . 




for any a, b ^ 0; if b = 0, then 



23. A" = ( C0Sn l - sin "^ 24. ( ° 

\sin nO cos rid) \ — 1 0/ 

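25. $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 9 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 27 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 16 & 0 \\ 0 & 0 & 81 \end{pmatrix}$

26. Diagonal matrix with diagonal $a_1^k, a_2^k, \ldots, a_n^k$ 27. $A^3 = O$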


Chapter IX, §1

1. (a) 11 (b) 13 (c) 6 2. (a) (e, 1) (b) (1, 0) (c) (1/e, -1)

3. (a) 1 (b) 11 4. Ellipse $9x^2 + 4y^2 = 36$ 5. Line $x = 2y$

6. Circle $x^2 + y^2 = e^2$, circle $x^2 + y^2 = e^{2c}$

7. Cylinder, radius 1, z-axis = axis of cylinder 8. Circle $x^2 + y^2 = 1$
12. A = O


Chapter IX, §2 


1. (a) (5, 3) (b) (5, 0) (c) (5, 1) (d) (0, -3) 


2 . 








6. (1, 0, 0) 




4. $\begin{pmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{pmatrix}$



0 0\ 
1 o) 




0 0 0\ 
1 0 o) 


10. $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$


11. Only A = 0. 




17. Let A = L(1). For any number t, we have by linearity, L(t) = L(t · 1)
= tL(1) = tA.


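18. $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$

19. $\begin{pmatrix} 3 & -5 \\ 1 & 7 \\ -4 & -8 \end{pmatrix}$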

Chapter IX, §3
3. L(E') = Q , L(E 2 ) = 

5. It is the set of all points

$t_1A + t_2B + t_3C$,

with numbers $t_i$ satisfying $0 \le t_i \le 1$ for i = 1, 2, 3. Let S be this
parallelepiped. The image of S under L is the set L(S) consisting of all points

$t_1L(A) + t_2L(B) + t_3L(C)$,

with $t_i$ satisfying the above inequality. Hence it is a parallelepiped if L(A),
L(B), L(C) do not all lie in a plane.



7. The three column vectors of the matrix 

8. It is the set of points L(P) + tL(A) with all t in R.

9. (a) P + t(Q - P) (b) L(P) + tL(Q - P) = L(P) + t[L(Q) - L(P)]

10. It is the set of points tL(A) + sL(B), with t, s in R.

11. It is the set of points L(P) + tL(A) + sL(B) with t, s in R.

Chapter IX, §4 

1. Inverse of F is the map G such that G(X) = (1/7)X.

2. G(X) = (-1/8)X 3. G(X) = $c^{-1}X$.

4. $(AB)^{-1} = B^{-1}A^{-1}$; $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$. Just multiply out

$ABB^{-1}A^{-1} = I$ and $ABCC^{-1}B^{-1}A^{-1} = I$.

The same also holds taking the multiplication on the other side.

5. $(I + A)(I - A) = (I - A)(I + A) = I^2 - A^2 = I$ so I + A is an
inverse for I - A.

6. $I = A(-2I - A)$, so $-(2I + A)$ is an inverse (it commutes with A).

7. We have $(I - A)(I + A + A^2) = (I + A + A^2)(I - A) = I - A^3 = I$,

so $I + A + A^2$ is an inverse for I - A.
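
The identity used in answer 7 can be checked numerically. The following is a minimal sketch in plain Python; the particular matrix `A` and the helper names `matmul`, `matadd`, `matsub` are hypothetical choices made for illustration, and the only hypothesis taken from the answer is that $A^3 = O$.

```python
# Check that I + A + A^2 is a two-sided inverse of I - A when A^3 = O.
# The matrix A below is an arbitrary strictly upper triangular example.

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def matadd(P, Q):
    """Entrywise sum of two matrices of the same size."""
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def matsub(P, Q):
    """Entrywise difference of two matrices of the same size."""
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]

A2 = matmul(A, A)
A3 = matmul(A2, A)                      # should be the zero matrix

candidate = matadd(matadd(I3, A), A2)   # I + A + A^2
left = matmul(matsub(I3, A), candidate)
right = matmul(candidate, matsub(I3, A))

print(A3, left == I3, right == I3)
```

Both products are computed because an inverse must work on either side, which is the point made at the end of answer 4.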

Chapter X, §1

1. (a) 26 (b) 5 (c) -5 (d) -42 (e) -3 (f) 9 

2. 1 3. (a) 1 (b) -1 (c) (d) 0 5. $D(cA) = c^2D(A)$.

Chapter X, §2 

2. (a) -20 (b) 5 (c) 4 (d) 5 (e) -76 (f) -14 

3. (a) 140 (b) 120 (c) -60 

4. abc 5. (a) 3 (b) -24 (c) 16 (d) 14 (e) 0 (f) 8 (g) 8 (h) -10 
6. $a_{11}a_{22}a_{33}$, both (a) and (b)

Chapter X, §3

5. (a) -20 (b) 7 (c) 4 (d) 5 (e) -76 (f) -14 

6. (a) 1 (b) -42 (c) 0 (d) 0 (e) 12 (f) 14 (g) 108 (h) 135 (i) 10 

7. $a_{11}a_{22}a_{33}$

8. (a) 0 (b) 24 (c) -12 (d) 0 (e) 27 (f) -54 (g) -21 (h) -4 

(i) 5 (j) 0 (k) -18 (l) 0

9. $D(cA) = c^3D(A)$

14. 1 15. $t^2 + 8t + 5$


Chapter X, §4 

1. If a number x is such that B = xA, then

D(A, B, C) = D(A, xA, C) = xD(A, A, C) = 0,
contrary to assumption.

5. Let x, y, z be numbers such that xA + yB + zC = O. Then
0 = D(O, B, C) = D(xA + yB + zC, B, C)

= xD(A, B, C) + yD(B, B, C) + zD(C, B, C)

= xD(A, B, C).

Since D(A, B, C) ≠ 0 by assumption, it follows that x = 0. A similar
argument computing D(A, O, C) and D(A, B, O) shows that y = 0 and
z = 0.
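
The two determinant facts used in these answers (a repeated direction gives determinant 0, and the determinant is linear in its first column) can be illustrated with a short computation. This is a sketch under stated assumptions: `det3` is a hypothetical helper that treats its three arguments as the columns of a 3 × 3 matrix, and the vectors and coefficients are arbitrary sample values.

```python
# Illustrate D(A, xA, C) = 0 and D(xA + yB + zC, B, C) = x D(A, B, C)
# for 3-vectors, with D the determinant of the matrix of columns.

def det3(A, B, C):
    """Determinant of the 3x3 matrix whose columns are A, B, C."""
    return (A[0] * (B[1] * C[2] - B[2] * C[1])
            - B[0] * (A[1] * C[2] - A[2] * C[1])
            + C[0] * (A[1] * B[2] - A[2] * B[1]))

A = (1, 0, 0)
B = (1, 1, 0)
C = (1, 1, 1)

# Answer 1: if the second column is a multiple of the first, D vanishes.
x = 5
dependent = det3(A, tuple(x * a for a in A), C)

# Answer 5: expand D(xA + yB + zC, B, C) by linearity in the first column.
x, y, z = 2, 3, -1
V = tuple(x * a + y * b + z * c for a, b, c in zip(A, B, C))
lhs = det3(V, B, C)
rhs = x * det3(A, B, C)

print(dependent, lhs, rhs)
```

Since D(A, B, C) is nonzero for these sample vectors, the equality of `lhs` and `rhs` shows why xA + yB + zC = O forces x = 0.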


Chapter X, §6 
1. (a) (J I) 

\ 9 9/ 

- be \- 


(b) 


(i -i) (_f 4) (0) ( 


4 X' 

5 5 
3 

5 5 > 


2 . 


ad 

Chapter XI, §2 

lm (a) (2xy i 2 ) 


d —h 
c a t 




yz 

2 xz 


X 

sin xy 

xz 
0 


0 

— x sin xy 


) 


(C) 


r ye T ' J xe xy \ 
\\/x 0 / 


xy 

v2 


) 


(0 


2. (a) 


f yz cos xyz xz cos xyz yx cos xyz 
K z 0 x 


) 


4. (a) 



-1 

7T • 7T 2 

— - sin — 

2 2 

0 \ 
7T 2 J 

-Trsin—y 

(c) 

(r 

<e > (_i 

-2-2\ /8 
0 4/ w \4 

47r 

0 

/— y sin xy 

—x sin xy 



I y cos jcy 

x cos xy 

° 


\ z 

0 

x/ 



o J ) 

) 


2tt 

TV 
5. $\Delta_F(X) = x^2 - 2xy$. $\Delta_F(X) = 0$ when x = 0, y arbitrary, and also at all
points with x = 2y.

6. $\Delta_F(X) = -x\cos x\sin xy$

7. $\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}$; determinant vanishes only for r = 0.

8. $\begin{pmatrix} \sin\varphi\cos\theta & -r\sin\varphi\sin\theta & r\cos\varphi\cos\theta \\ \cos\varphi & 0 & -r\sin\varphi \\ \sin\varphi\sin\theta & r\sin\varphi\cos\theta & r\cos\varphi\sin\theta \end{pmatrix}$

Determinant $r^2\sin\varphi$

9. $\begin{pmatrix} e^r\cos\theta & -e^r\sin\theta \\ e^r\sin\theta & e^r\cos\theta \end{pmatrix}$

Determinant is $e^{2r}$. $F(r, \theta) = F(r, \theta + 2\pi)$.


Chapter XI, §4 

1. Yes in all cases 2. (a), (b), (c), (d) all locally $C^1$-invertible
3. $F(x, y) = F(x, y + 2\pi)$

5. Letting $y = \varphi(x)$, we have

$$\varphi''(x) = -\frac{1}{D_2f(x,y)^2}\Big[D_2f(x,y)\big(D_1^2f(x,y) + D_2D_1f(x,y)\varphi'(x)\big) - D_1f(x,y)\big(D_1D_2f(x,y) + D_2^2f(x,y)\varphi'(x)\big)\Big]$$


6. (a) No (b) Yes (c) Yes 

9. (a) We have $2x - y - xy' + 2yy' = 0$. This yields

$\varphi'(1) = 0$.

(b)*'(l)=l (c)?'(l)=-$ (d) ^'(-1) = (e)*>'(0)=-l 

(0 <f'( 2 ) = 

10. (a) both —1 

(b) £>^(0,0) = 0; D 2 <p(0, 0) = 0 

(c) Z) 2 «p(l,l) = i 

(d) dm^A) = -I; dmoA) = -l 

11. Dup = D 2 ip - 

12. (a) Dvpiid, -1) = -1; D 2 *>(0, -1) = -1 

(b) Dvp(0,0) = 0; D 2 <p(0, 0) = — 1 

(c) Dvp(\,2) = -1; D 2 ^(l,2) = 3 

(d) DkpAA) = -f; Z) 2 ^(i|) = -1 


Chapter XII, §2 

1. (a) 12 (b) (c) ^ (d) 2 + 7r 2 /2 (e) f (f) tt/ 4 (g) f 

(h) ||t r (i) 3 

3. (a) i - tt (b) c - 1/e (c) t 2 - ^ (d) -§f 

( a ) 2l) (b) ITS ( c ) 4 

6. (a) (b) c- 3 /3 - e~ 2 /2 - e 3 /3 + e 2 /2 

(c) 1 - cos 2 (d) 0 (e) 1 (f) ±- 

7. 3tt/8 

8. (a) log 2 (b) £ (c) 7T (d) (e) log ff 9. yfg 

Chapter XII, §3

1. (e — l)7r 2. 37r/2 3. 7r(l — e ~ a2 ) 4. tt 5. 2ka 4 /3 6. 7>kira i /2 
7. ibr/4 8. 7ra 2 9. 7ra 4 /8 10. a 3 \/2/6 11. a 2 (7r + 8)/4 
12. fl 3 (157r + 32)/24 13. 2a 2 

i4. ^ “ (- ^ 15 ' 2 rt-(a 2 + 1)- 1/2 + 1] 

+ Limit = * /2 

17. Trgffa 18. (a) -5 tt/ 4 (b) If™ 4 19. (a) 3x/4 (b) 0 (c) 0 

i — n-\-2 — n-\- 2” 

20. (a) 2tt -——r—— if n ^ 2 

— n + 2 

27r[log b — log a] if n = 2. 

(b) The integral approaches a limit of n — 0, 1 

Chapter XII, §4 

1. §7r<2 3 2. 0 3. (a) ka i iv (b) 27 t(1 — a 2 ) 4. 2irk(b 2 — a 2 ) 5. irba*/4 

r 1 r 4 

6. k-jra^/2 7. tt/8 8. 2tt -£(1 - rlf 2 + - - y 

2 — 1 + V 5 

where r 0 =-—— 

9. fa 3 (37r - 4) 10. 7ra 3 11. (a) tt/ 3 (b) 2 tt\/2/3 (c) tt/ 2 (d) tt/ 32 
12. (a) 25 (b) 15/2 (c) la 2 b 3 /3 13. 64/3 

”, 3 —n _ 3—n" 

14. (a) 4 tt ----- if n ^ 3 

3 — n 

47r[log A — log a] if n = 3. 

(b) The integral approaches a limit of n = 0, 1, 2. 


Chapter XII, §5 


1. (1,5/3) 2. (5/2,2) 3. (o,|^ 4. (1,-415) 5. (tt/2, tt/8) 

_ 7r , ttV^ /- _ a/ 2 -f- 1 

6.* = - + —-i-V 2 , y-~— 

_ 2a' 2 log a — a 2 + 1 _ _ a (log af 

X 4 (a log a — a + 1) ^ 2 (a log a — a + 1) 


8. (0,0, fA) 9. (al-f^TT (b) ( — » — 


21 96 ’ 

10 ’ 257t> 


10. (a) \kirh 2 r 2 (b) f h 11. (a) (b) (0,0) (c)(^»~) 

12. %ka 4 hw 13. ^0, 0, y j 

Chapter XIII, §1

1. (a) 7 (b) 14 2. (a) 14 (b) 1 

3. (a) 11 (b) 38 (c) 8 (d) 1 4. (a) 10 (b) 22 (c) 11 (d) 0 
Chapter XIII, §2 

1. t xab 2. §tt abc 3. (a) 29 3/4 (b) r 3 ' 4 4. (a) 33 3/5 (b) r 3/5 
Chapter XIII, §3 

1. t r 2. (a) (b) 0 3. (a) 42 (b) 120 4. 2 5. | 6. tt ah 7. 

8. 1500tt 9. 15 ?Tab 11. 0 


Chapter XIII, §4 

/I 0 0 

1. (a) [ 0 p cos p sin p ) and determinant is p. 

\0 — p sin p cos pi 

(b) /// = Jff f(e,r,z)dzdrd9 

A G(A) 

2. abck 3. ^zrabc 4. ^ra 3 • 14 

5. (a) i (c) 1 (d) i 6. (a) £ (b) f 

Chapter XIV, §1

1. (a) —4 (b) 4 (c) 47t (d) 7r (e) 8 (f) 7 tg6 3. 0 

Chapter XV, §1

1. — = (— (a + A cos <p) sin 0, (a + b cos <p) cos 9, 0) 
do 


dx 


dd 


= a + b cos p 


dx 

dp 


dX 


— ((a — b sin p) cos 6, (a — b sin p) sin 9, b cos p) 


dX 

dp 


= + 6 2 — lab 


sin p 


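2. $\dfrac{\partial X}{\partial\theta} = (-z\sin\alpha\sin\theta,\ z\sin\alpha\cos\theta,\ 0)$

$\dfrac{\partial X}{\partial z} = (\sin\alpha\cos\theta,\ \sin\alpha\sin\theta,\ \cos\alpha)$

$\dfrac{\partial X}{\partial\theta} \times \dfrac{\partial X}{\partial z} = (z\sin\alpha\cos\theta\cos\alpha,\ z\sin\alpha\sin\theta\cos\alpha,\ -z\sin^2\alpha)$

$\left\|\dfrac{\partial X}{\partial\theta} \times \dfrac{\partial X}{\partial z}\right\| = z\sin\alpha$

Equation of surface is $x^2 + y^2 = (\tan\alpha)^2z^2$.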
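
3. $\dfrac{\partial X}{\partial t} = (a\cos\theta,\ a\sin\theta,\ 2t)$

$\dfrac{\partial X}{\partial\theta} = (-at\sin\theta,\ at\cos\theta,\ 0)$

$\dfrac{\partial X}{\partial t} \times \dfrac{\partial X}{\partial\theta} = (-2at^2\cos\theta,\ -2at^2\sin\theta,\ a^2t)$

$\left\|\dfrac{\partial X}{\partial t} \times \dfrac{\partial X}{\partial\theta}\right\| = \sqrt{4a^2t^4 + a^4t^2}$

The equation is $x^2 + y^2 = a^2z$.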


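4. $\dfrac{\partial X}{\partial\varphi} = (a\cos\varphi\cos\theta,\ b\cos\varphi\sin\theta,\ -c\sin\varphi)$

$\dfrac{\partial X}{\partial\theta} = (-a\sin\varphi\sin\theta,\ b\sin\varphi\cos\theta,\ 0)$

$\dfrac{\partial X}{\partial\varphi} \times \dfrac{\partial X}{\partial\theta} = (bc\sin^2\varphi\cos\theta,\ ac\sin^2\varphi\sin\theta,\ ab\sin\varphi\cos\varphi)$

$\left\|\dfrac{\partial X}{\partial\varphi} \times \dfrac{\partial X}{\partial\theta}\right\| = \sqrt{b^2c^2\sin^4\varphi\cos^2\theta + a^2c^2\sin^4\varphi\sin^2\theta + a^2b^2\sin^2\varphi\cos^2\varphi}$

The equation is $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$.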

a 2 b z c z 

dX 

5. —— = (—a sin 0, a cos 0, 0) 

O0 

dX , n . n ^ 

—— = (a cos 0, a sm 0, 1) 
oz 

d* d* , n n 2 X 

~dd X ~d7 = " ^ C ° S 69 a Sm 05 a 

AY dZ 
30 X dz 


— V / a 2 + a 4 


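The equation is $x^2 + y^2 = a^2$

6. $\dfrac{\partial X}{\partial r} = (\cos\theta,\ \sin\theta,\ f'(r))$

$\dfrac{\partial X}{\partial\theta} = (-r\sin\theta,\ r\cos\theta,\ 0)$

$\dfrac{\partial X}{\partial r} \times \dfrac{\partial X}{\partial\theta} = (-f'(r)r\cos\theta,\ -f'(r)r\sin\theta,\ r)$

$\left\|\dfrac{\partial X}{\partial r} \times \dfrac{\partial X}{\partial\theta}\right\| = r\sqrt{f'(r)^2 + 1}$

Equation is $z = f(\sqrt{x^2 + y^2})$.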
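
The cross product and norm stated in answer 6 for a surface of revolution can be spot-checked at a sample point. The sketch below assumes a hypothetical profile f(r) = r², so that f'(r) = 2r; the function names and the sample values of r and θ are illustrative only.

```python
import math

# Spot-check the normal vector of the surface
# X(r, theta) = (r cos theta, r sin theta, f(r)),
# using the hypothetical profile f(r) = r**2.

def f_prime(r):
    """Derivative of the assumed profile f(r) = r**2."""
    return 2.0 * r

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

r, theta = 1.5, 0.7

dX_dr = (math.cos(theta), math.sin(theta), f_prime(r))
dX_dtheta = (-r * math.sin(theta), r * math.cos(theta), 0.0)

normal = cross(dX_dr, dX_dtheta)
stated = (-f_prime(r) * r * math.cos(theta),
          -f_prime(r) * r * math.sin(theta),
          r)

norm = math.sqrt(sum(c * c for c in normal))
stated_norm = r * math.sqrt(f_prime(r) ** 2 + 1.0)

print(normal, stated, norm, stated_norm)
```

Agreement at one point is only a sanity check, not a proof; the general formula follows from expanding the cross product symbolically.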

Chapter XV, §2

1. (a) tiV2 (b) ±gvh 2 2. ~ (5V5 - 1) 3. 2tt(\/3 - £) 

6 

4. -§ 7 r( 2 V 2 - 1) 5. le 2 arcsinh 1 - ^“ 2 arcsinh 1 + \ sinh 1 
6. 4V6 7. 2V27T 8. 2 tt(1 - V2/2) 9. 4n 2 a 


Chapter XV, §3

1. 47rtf 4 /3 2. 7ra 5 /2 3. 47ra 6 /15 4. 7ra 7 /3 5. — (25V5 - 11) 

60 

6. 0 7. 0 8. TTtf 3 9. ™ 4 /2 12. 4 tt/3 13. tt 2 /4 + 2tt 14. 4tt 
15. 104/3 16. 2W2 17. ^ (8 - 5V2) 18. 5/12 

19. (a) 271 -a 2 (b) 3 t™ 2 20. 3/2 21. 5tt/4 


Chapter XV, §4 


1. $\nabla \cdot F = 2x + xz + 2yz$

$\nabla \times F = (z^2 - xy,\ 0,\ yz)$

2. V-F=- + - + — 

* y z 

V X F = (* log z, —y log z, log y — log x) 

3. V • F = 2x + x cos xy + e x y 

V X F = ( e x z, —e x yz, y cos xy) 

4. V • F = ye xy sin z + e* 2 cos y + ye 1 ' 2 cos x 

V X F = (ze y2 cos a — jce** sin y, e* 2 ' cos z + e yz sin x, 

ze xz sin y — xe xy sin z) 


Chapter XV, §5 

2. 3/2 3. 0 4. 64tt 5. 0 6. 0 7. 16 8. 24 9. 24tt 10. 48tt 
11. 243tt/2 12. 135tt 13. 1/40 
1. 0 2. -13/6 3. 0 4. 0 5. 0 6. 0 7. -to 1 


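Appendix, §1

1. $\displaystyle\int_{-\pi}^{\pi} cf(x)\,dx = c\int_{-\pi}^{\pi} f(x)\,dx$

and

$\displaystyle\langle f, g + h\rangle = \int_{-\pi}^{\pi} f(x)[g(x) + h(x)]\,dx = \int_{-\pi}^{\pi} [f(x)g(x) + f(x)h(x)]\,dx$

$\displaystyle = \int_{-\pi}^{\pi} f(x)g(x)\,dx + \int_{-\pi}^{\pi} f(x)h(x)\,dx = \langle f, g\rangle + \langle f, h\rangle.$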

2. Take the scalar product with $f_i$. We obtain for each i,

$\displaystyle 0 = \langle c_1f_1 + \cdots + c_nf_n, f_i\rangle = \sum_{k=1}^{n} c_k\langle f_k, f_i\rangle = c_i.$

3. If $\langle h_1, f\rangle = 0$ and $\langle h_2, f\rangle = 0$, then

$\langle h_1 + h_2, f\rangle = \langle h_1, f\rangle + \langle h_2, f\rangle = 0.$

If c is a number and $\langle h, f\rangle = 0$, then $\langle ch, f\rangle = c\langle h, f\rangle = 0$.


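4. $\displaystyle\left|\int f(x)g(x)\,dx\right| \le \left(\int f(x)^2\,dx\right)^{1/2}\left(\int g(x)^2\,dx\right)^{1/2}$

$\displaystyle\left(\int [f(x) + g(x)]^2\,dx\right)^{1/2} \le \left(\int f(x)^2\,dx\right)^{1/2} + \left(\int g(x)^2\,dx\right)^{1/2}$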
7. (b) 1/4 (c) ||/|| = and ||g|| = -- (d) 1/2, 1/3, 1 


Appendix, §2 

1. (a) V57T (b) (7T + 7T 3 /3) 12 

„ t , X . sin 2a: 

4. (a) - = sin a:-- - t- * • ■ 


+ (-i ) n+1 sin - + 


/i \ 2 ^ 

(b) a: = — 


^COS X 


(c) \x 


(f) I COS X 


T 4 / 
~ 2 “ w \ 

7T \2 


COS X + 


cos 2x 
_ 22 “ 

cos 3x 


+ 

+ ' 


+ ( -ir +1 ^ + 

cos (2 h + 1)* 

H-, \ o h 


(2/i + l) 2 


cos 2x , 
2 ^ 3 ^ 


r _ 1 .n-1 COS 2/IX 
+ 1 ; 4«2 - 1 + 


) 


■) 


) 


(g) $\sin^3 x = \tfrac{3}{4}\sin x - \tfrac{1}{4}\sin 3x$


... 1 cos 2x 
(d) 2 - 2 


(e) |sin x 


7T \2 


cos 2x 


cos 


4/l2 


2nx \ 

~1- ") 


(h) $\cos^3 x = \tfrac{3}{4}\cos x + \tfrac{1}{4}\cos 3x$


1/2 

Acceleration, 40 
Angle between planes, 29 
Angle between vectors, 23 
Area, 241 

Ball, 59 

Beginning point, 9 
Bessel inequality, 344 
Boundary point, 128 
Bounded, 139, 227 

-curve, 89 

Center of mass, 266 
Chain rule, 71, 216 
Change of variables formula, 282, 
288, 301 
Circle, 59 
Closed, 139 
Closed ball, 59 
Closed path, 98 
Column, 148 
Component, 20 
Composite, 176 
Conservation law, 84 
Conservative, 84 
Constraint, 141 
Continuous, 65, 96 
Coordinate, 4 
Critical point, 133 
Cross product, 33 
Curl, 326 
Curvature, 51
Curve, 37 


Curve integrals, 96 
Cylindrical coordinates, 255 

Derivative, 38, 208 
Determinant, 183, 275
Determinant as area and volume, 271 
Differentiable curve, 38 
Differentiable function, 65 
Differential operator, 117 
Differentiating under integral sign, 90

Dilation, 161, 278 
Direction, 10 
Directional derivative, 82 
Disc, 59 
Distance, 17 
Divergence, 299, 325 
Divergence theorem, 328 
Dot product, 11 

End point, 9 
Equivalent vectors, 9 
Euler’s relation, 78 
Extremum, 141 

Fourier coefficients, 342 
Fourier series, 348 

Gradient, 62 
Graph, 56 

Green’s theorem, 293 
Hyperplane, 29 
Identity, 178 
Image, 160 

Implicit function, 220 
Independent vectors, 175, 200 
Injective, 179 
Integrable, 230 
Interior point, 138 
Inverse, 178, 180, 218 
Invertible, 181 

Jacobian determinant, 213 
Jacobian matrix, 211 

Kinetic energy, 85 

Lagrange multipliers, 141 
Laplace equation, 113, 300 
Length, 14 
Length of curve, 45 
Level curve, 56 
Line, 25 

Line segment, 170 
Linear map, 165 
Local max, 134 
Local min, 134 
Located vector, 9 
Lower sum, 229 

Mapping, 159 
Mass, 231, 241
Matrix, 147 
Maximum, 139 
Minimum, 139

Norm, 14 
Normal, 28 

Normal vector to surface, 310 

Open ball, 59 
Open set, 60 
Opposite direction, 10 
Orthogonal, 13 
Osculating plane, 49 

Pappus theorem, 323 
Parallelogram, 172 
Parametrized surface, 303 


Partial derivative, 61 
Partial differential operator, 115 
Partition, 228 
Path, 98 

Perpendicular, 13, 18, 27 
Piecewise , 98 
Piecewise continuous, 346 
Plane, 27 

Plane spanned by vectors, 175 
Point, 4 

Polar coordinates, 162, 246 
Polynomial approximation, 126 
Potential energy, 85 
Potential function, 87 
Projection, 20 

Quadratic form, 135 

Radius of curvature, 51 
Rectangular coordinates, 254 
Regular point, 309 
Repeated integral, 236 
Riemann sum, 229 
Row, 147 

Scalar product, 11, 339 
Schwarz inequality, 343 
Segment, 170 

Simple differential operator, 116 

Smooth, 233 

Span, 175 

Speed, 40 

Sphere, 59 

Spherical coordinates, 258 
Square matrix, 148 
Stokes’ theorem, 334 
Subrectangle, 228 
Subset, 159 
Surface, 79 
Surface area, 313 
Surface integrals, 317 
Surjective, 179 

Tangent line, 39 
Tangent linear map, 209 
Tangent plane, 309 
Tangent vector, 39 
Torus, 306 
Translation, 163 
Transpose, 150 
Triangle inequality, 23 

Unit matrix, 153 
Unit vector, 17 
Upper sum, 229 


Value, 160 
Vector, 10 
Vector field, 84 
Velocity vector, 39 

Zero matrix, 148 